1P5:换词器


20

这是首次定期Premier Programming难题推送的一部分

游戏

提供了相同长度的开始和结束词。游戏的目的是使用最少的步骤,将起始单词中的一个字母更改为另一个有效单词,重复此步骤直到到达终止单词。例如,给定单词TREE和FLED,输出将是:

TREE
FREE
FLEE
FLED
2

技术指标

  • 有关OWLSOWPODS的维基百科文章就单词列表而言可能是有用的起点。
  • 该程序应支持两种选择开始和结束词的方式:
    1. 用户通过命令行,stdin或任何适合您选择的语言的方式指定(只需提及您在做什么)。
    2. 从文件中随机选择2个单词。
  • 起始词和结束词以及所有临时词的长度均应相同。
  • 每个步骤都应在其行上打印出来。
  • 输出的最后一行应该是开始和结束单词之间所需的临时步骤数。
  • 如果在起始词和结束词之间找不到匹配项,则输出应由3行组成:起始词,结束词和OY词。
  • 在答案中包括大O符号
  • 请包括10个唯一的开始和结束词对(当然还有它们的输出)以显示程序产生的步骤。(为节省空间,虽然程序应在单独的行上输出这些行,但是您可以将它们合并为一行以进行发布,用空格和每次运行之间的逗号替换新行。

进球/获胜标准

  • 最快/最好的Big O解决方案将在一周后产生最短的过渡步骤。
  • 如果由大O标准导致平局,则最短的代码将获胜。
  • 如果仍然有平局,则第一个达到最快和最短修订版本的解决方案将获胜。

测试/样品输出

DIVE
DIME
DAME
NAME
2

PEACE
PLACE
PLATE
SLATE
2

HOUSE
HORSE
GORSE
GORGE
2

POLE
POSE
POST
PAST
FAST
3

验证方式

我正在研究可用于验证输出的脚本。

它会:

  1. 确保每个单词都有效。
  2. 确保每个单词与上一个单词恰好有1个字母不同。

它不会:

  1. 检查是否使用了最短的步骤数。

一旦我写好了,我当然会更新这篇文章。(:


4
对于我来说,执行3项从HOUSE到的GORGE操作的报告为2 似乎让我感到奇怪。我意识到有2个中间词,因此确实有道理,但是操作#会更直观。
马修(Matthew)

4
@Peter,根据sowpods Wikipedia页面,大约有15k个单词,长度超过13个字母
gnibbler,2011年

4
我并不想成为一个知道这一切,但实际上砌有一个名称,它是由刘易斯·卡罗尔发明 en.wikipedia.org/wiki/Word_ladder
st0le

1
您在问题中有一个不确定的目标:The fastest/best Big O solution producing the shortest interim steps after one week will win.由于不能保证,最快的解决方案是同时使用最少步骤的解决方案,因此,如果一种解决方案使用的步骤较少,但后来却达到了目标,则应该提供一个优先选择。
用户未知,

2
我只是想确认BATCAT将具有零个步骤,对不对?
st0le 2011年

Answers:


9

由于长度被列为标准,因此这是高尔夫球版的1681个字符(可能仍可以提高10%):

import java.io.*;import java.util.*;public class W{public static void main(String[]
a)throws Exception{int n=a.length<1?5:a[0].length(),p,q;String f,t,l;S w=new S();Scanner
s=new Scanner(new
File("sowpods"));while(s.hasNext()){f=s.next();if(f.length()==n)w.add(f);}if(a.length<1){String[]x=w.toArray(new
String[0]);Random
r=new Random();q=x.length;p=r.nextInt(q);q=r.nextInt(q-1);f=x[p];t=x[p>q?q:q+1];}else{f=a[0];t=a[1];}H<S>
A=new H(),B=new H(),C=new H();for(String W:w){A.put(W,new
S());for(p=0;p<n;p++){char[]c=W.toCharArray();c[p]='.';l=new
String(c);A.get(W).add(l);S z=B.get(l);if(z==null)B.put(l,z=new
S());z.add(W);}}for(String W:A.keySet()){C.put(W,w=new S());for(String
L:A.get(W))for(String b:B.get(L))if(b!=W)w.add(b);}N m,o,ñ;H<N> N=new H();N.put(f,m=new
N(f,t));N.put(t,o=new N(t,t));m.k=0;N[]H=new
N[3];H[0]=m;p=H[0].h;while(0<1){if(H[0]==null){if(H[1]==H[2])break;H[0]=H[1];H[1]=H[2];H[2]=null;p++;continue;}if(p>=o.k-1)break;m=H[0];H[0]=m.x();if(H[0]==m)H[0]=null;for(String
v:C.get(m.s)){ñ=N.get(v);if(ñ==null)N.put(v,ñ=new N(v,t));if(m.k+1<ñ.k){if(ñ.k<ñ.I){q=ñ.k+ñ.h-p;N
Ñ=ñ.x();if(H[q]==ñ)H[q]=Ñ==ñ?null:Ñ;}ñ.b=m;ñ.k=m.k+1;q=ñ.k+ñ.h-p;if(H[q]==null)H[q]=ñ;else{ñ.n=H[q];ñ.p=ñ.n.p;ñ.n.p=ñ.p.n=ñ;}}}}if(o.b==null)System.out.println(f+"\n"+t+"\nOY");else{String[]P=new
String[o.k+2];P[o.k+1]=o.k-1+"";m=o;for(q=m.k;q>=0;q--){P[q]=m.s;m=m.b;}for(String
W:P)System.out.println(W);}}}class N{String s;int k,h,I=(1<<30)-1;N b,p,n;N(String S,String
d){s=S;for(k=0;k<d.length();k++)if(d.charAt(k)!=S.charAt(k))h++;k=I;p=n=this;}N
x(){N r=n;n.p=p;p.n=n;n=p=this;return r;}}class S extends HashSet<String>{}class H<V>extends
HashMap<String,V>{}

不使用版本的版本是:使用软件包名称和方法,并且不给出警告或扩展类以使其别名。

package com.akshor.pjt33;

import java.io.*;
import java.util.*;

// WordLadder partially golfed and with reduced dependencies
//
// Variables used in complexity analysis:
// n is the word length
// V is the number of words (vertex count of the graph)
// E is the number of edges
// hash is the cost of a hash insert / lookup - I will assume it's constant, but without completely brushing it under the carpet
public class WordLadder2
{
    private Map<String, Set<String>> wordsToWords = new HashMap<String, Set<String>>();

    // Initialisation cost: O(V * n * (n + hash) + E * hash)
    private WordLadder2(Set<String> words)
    {
        Map<String, Set<String>> wordsToLinks = new HashMap<String, Set<String>>();
        Map<String, Set<String>> linksToWords = new HashMap<String, Set<String>>();

        // Cost: O(Vn * (n + hash))
        for (String word : words)
        {
            // Cost: O(n*(n + hash))
            for (int i = 0; i < word.length(); i++)
            {
                // Cost: O(n + hash)
                char[] ch = word.toCharArray();
                ch[i] = '.';
                String link = new String(ch).intern();
                add(wordsToLinks, word, link);
                add(linksToWords, link, word);
            }
        }

        // Cost: O(V * n * hash + E * hash)
        for (Map.Entry<String, Set<String>> from : wordsToLinks.entrySet()) {
            String src = from.getKey();
            wordsToWords.put(src, new HashSet<String>());
            for (String link : from.getValue()) {
                Set<String> to = linksToWords.get(link);
                for (String snk : to) {
                    // Note: equality test is safe here. Cost is O(hash)
                    if (snk != src) add(wordsToWords, src, snk);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException
    {
        // Cost: O(filelength + num_words * hash)
        Map<Integer, Set<String>> wordsByLength = new HashMap<Integer, Set<String>>();
        BufferedReader br = new BufferedReader(new FileReader("sowpods"), 8192);
        String line;
        while ((line = br.readLine()) != null) add(wordsByLength, line.length(), line);

        if (args.length == 2) {
            String from = args[0].toUpperCase();
            String to = args[1].toUpperCase();
            new WordLadder2(wordsByLength.get(from.length())).findPath(from, to);
        }
        else {
            // 5-letter words are the most interesting.
            String[] _5 = wordsByLength.get(5).toArray(new String[0]);
            Random rnd = new Random();
            int f = rnd.nextInt(_5.length), g = rnd.nextInt(_5.length - 1);
            if (g >= f) g++;
            new WordLadder2(wordsByLength.get(5)).findPath(_5[f], _5[g]);
        }
    }

    // O(E * hash)
    private void findPath(String start, String dest) {
        Node startNode = new Node(start, dest);
        startNode.cost = 0; startNode.backpointer = startNode;

        Node endNode = new Node(dest, dest);

        // Node lookup
        Map<String, Node> nodes = new HashMap<String, Node>();
        nodes.put(start, startNode);
        nodes.put(dest, endNode);

        // Heap
        Node[] heap = new Node[3];
        heap[0] = startNode;
        int base = heap[0].heuristic;

        // O(E * hash)
        while (true) {
            if (heap[0] == null) {
                if (heap[1] == heap[2]) break;
                heap[0] = heap[1]; heap[1] = heap[2]; heap[2] = null; base++;
                continue;
            }

            // If the lowest cost isn't at least 1 less than the current cost for the destination,
            // it can't improve the best path to the destination.
            if (base >= endNode.cost - 1) break;

            // Get the cheapest node from the heap.
            Node v0 = heap[0];
            heap[0] = v0.remove();
            if (heap[0] == v0) heap[0] = null;

            // Relax the edges from v0.
            int g_v0 = v0.cost;
            // O(hash * #neighbours)
            for (String v1Str : wordsToWords.get(v0.key))
            {
                Node v1 = nodes.get(v1Str);
                if (v1 == null) {
                    v1 = new Node(v1Str, dest);
                    nodes.put(v1Str, v1);
                }

                // If it's an improvement, use it.
                if (g_v0 + 1 < v1.cost)
                {
                    // Update the heap.
                    if (v1.cost < Node.INFINITY)
                    {
                        int bucket = v1.cost + v1.heuristic - base;
                        Node t = v1.remove();
                        if (heap[bucket] == v1) heap[bucket] = t == v1 ? null : t;
                    }

                    // Next update the backpointer and the costs map.
                    v1.backpointer = v0;
                    v1.cost = g_v0 + 1;

                    int bucket = v1.cost + v1.heuristic - base;
                    if (heap[bucket] == null) {
                        heap[bucket] = v1;
                    }
                    else {
                        v1.next = heap[bucket];
                        v1.prev = v1.next.prev;
                        v1.next.prev = v1.prev.next = v1;
                    }
                }
            }
        }

        if (endNode.backpointer == null) {
            System.out.println(start);
            System.out.println(dest);
            System.out.println("OY");
        }
        else {
            String[] path = new String[endNode.cost + 1];
            Node t = endNode;
            for (int i = t.cost; i >= 0; i--) {
                path[i] = t.key;
                t = t.backpointer;
            }
            for (String str : path) System.out.println(str);
            System.out.println(path.length - 2);
        }
    }

    private static <K, V> void add(Map<K, Set<V>> map, K key, V value) {
        Set<V> vals = map.get(key);
        if (vals == null) map.put(key, vals = new HashSet<V>());
        vals.add(value);
    }

    private static class Node
    {
        public static int INFINITY = Integer.MAX_VALUE >> 1;

        public String key;
        public int cost;
        public int heuristic;
        public Node backpointer;

        public Node prev = this;
        public Node next = this;

        public Node(String key, String dest) {
            this.key = key;
            cost = INFINITY;
            for (int i = 0; i < dest.length(); i++) if (dest.charAt(i) != key.charAt(i)) heuristic++;
        }

        public Node remove() {
            Node rv = next;
            next.prev = prev;
            prev.next = next;
            next = prev = this;
            return rv;
        }
    }
}

如您所见,运行成本分析为O(filelength + num_words * hash + V * n * (n + hash) + E * hash)。如果您接受我的假设,即哈希表插入/查找是恒定时间,那就是O(filelength + V n^2 + E)。SOWPODS中图表的特殊统计数据意味着大多数情况下O(V n^2)实际上都是占主导地位O(E)n

样本输出:

IDOLA,IDOLS,IDYLS,ODYLS,ODALS,OVALS,OVELS,OVENS,EVENS,ETENS,STENS,SKENS,SKINS,SPINS,SPINE,13

WICCA,PROSY,OY

布林妮,BRINS,TRINS,TAINS,TARNS,YARNS,YAWNS,YAWPS,YAPPS,7

GALES,GASE,GASTS,GEST,GESTE,GESSE,DESSE,5

SURES,DURES,沙丘,种类,丁类,橡皮泥,4

轻,轻,高低,大,大,大,小鸟,女孩,女孩,手枪,关西,关纳,鲁阿纳,10

SARGE,SERGE,SERRE,SERRS,SEERS,DEERS,DYERS,OYERS,OVERS,OVELS,OVALS,ODALS,ODYLS,IDYLS,12

串,串,串,啤酒,啤酒,布雷,布雷,奶油,绉布,7

这是最长路径最短的6对中的一对:

最高,最便宜,最晚,最晚,最晚,最晚,最晚,最晚,最早,最慢,最晚,最晚,最晚,最晚,最晚,最晚,最晚,最晚,最晚,最晚,最慢,最慢袋,小袋,小袋,小袋鼠,慕斯,慕斯,慕斯,布袋,加号,小圆饼,小袋,压片,小袋,尿素,尿布,未装袋,未装箱,未装箱,未装箱,未装箱,未装满,未入网,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已结,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,已贴,封号,封号,封号,标记,结尾,索引,入侵,入侵,入侵,感染,感染,感染,56

和最坏情况的可溶8字母对之一:

涂胶,涂胶,展开,展开,脱帽,松开,缠结,卷入,卷入,涂装,涂胶,松开,放样,打球,撒布,喷涂,散布,撒布,涂抹,涂抹,起重,起重,起重压接,压接,剪接,接缝,接缝,接缝,爬接器,夹板,夹板,夹板,夹板,便鞋,便鞋,钩针,粘手,平滑鞋,安抚奶嘴,南方人,粘嘴,粘鞋,粘鞋,教练,偷猎者,荷包,荷包,荷包午餐,林奇斯,林奇斯,林奇斯,52岁

现在,我认为我已经解决了问题的所有要求,即我的讨论。

对于CompSci,问题显然减少到图G中的最短路径,图G的顶点是单词,并且其边缘连接单词不同的单词。有效地生成图形并非易事-实际上,我需要重新考虑一下以降低O(V n hash + E)的复杂度。我这样做的方式涉及创建一个图,该图插入额外的顶点(对应于带有一个通配符的单词),并且与所讨论的图同胚。我确实考虑过使用该图,而不是减少到G-我认为应该从打高尔夫球的角度出发-基于具有3个以上边的通配符节点会减少图中的边数,并且最短路径算法的标准最坏情况运行时间为O(V heap-op + E)

但是,我要做的第一件事是对不同单词长度的图形G进行一些分析,我发现它们对于5个或更多字母的单词非常稀疏。5个字母的图具有12478个顶点和40759个边;添加链接节点会使图形更糟。到您使用8个字母时,边的数量少于节点的数量,并且3/7个单词是“独白”。因此,我拒绝了这种优化想法,因为它没有太大帮助。

确实有用的想法是检查堆。老实说,过去我已经实现了一些适度的奇异堆,但没有一个像这样奇特。我使用A-star(因为给定我正在使用的堆,C不能提供任何好处)具有明显不同于目标的字母数启发法,并且一点分析表明,在任何时候,优先级都不超过3个在堆里。当我弹出一个优先级为(成本+启发式)的节点并查看其邻居时,我在考虑三种情况:1)邻居的成本为cost + 1;邻居的启发式方法是heuristic-1(因为它更改的字母变为“正确”);2)cost + 1和heuristic + 0(因为它更改的字母从“错误”变为“仍然错误”;3)cost + 1和heuristic + 1(因为它更改的字母从“正确”变为“错误”)。因此,如果我放松邻居,我将以相同的优先级,优先级+1或优先级+2插入它。结果,我可以为堆使用链接列表的3元素数组。

我应该添加一个关于我的假设,即哈希查找是恒定的。您可能会说很好,但是哈希计算又如何呢?答案是我要摊销它们:java.lang.String缓存其hashCode(),因此计算散列所花费的总时间为O(V n^2)(生成图形时)。

还有另一个影响复杂性的变化,但是是否是优化的问题取决于您对统计的假设。(IMO将“最佳Big O解决方案”作为标准是错误的,因为没有最佳复杂性,原因很简单:没有一个变量)。此更改会影响图形生成步骤。在上面的代码中,它是:

        Map<String, Set<String>> wordsToLinks = new HashMap<String, Set<String>>();
        Map<String, Set<String>> linksToWords = new HashMap<String, Set<String>>();

        // Cost: O(Vn * (n + hash))
        for (String word : words)
        {
            // Cost: O(n*(n + hash))
            for (int i = 0; i < word.length(); i++)
            {
                // Cost: O(n + hash)
                char[] ch = word.toCharArray();
                ch[i] = '.';
                String link = new String(ch).intern();
                add(wordsToLinks, word, link);
                add(linksToWords, link, word);
            }
        }

        // Cost: O(V * n * hash + E * hash)
        for (Map.Entry<String, Set<String>> from : wordsToLinks.entrySet()) {
            String src = from.getKey();
            wordsToWords.put(src, new HashSet<String>());
            for (String link : from.getValue()) {
                Set<String> to = linksToWords.get(link);
                for (String snk : to) {
                    // Note: equality test is safe here. Cost is O(hash)
                    if (snk != src) add(wordsToWords, src, snk);
                }
            }
        }

那是O(V * n * (n + hash) + E * hash)。但是O(V * n^2)部分来自为每个链接生成一个新的n字符字符串,然后计算其哈希码。可以通过帮助程序类避免这种情况:

    private static class Link
    {
        private String str;
        private int hash;
        private int missingIdx;

        public Link(String str, int hash, int missingIdx) {
            this.str = str;
            this.hash = hash;
            this.missingIdx = missingIdx;
        }

        @Override
        public int hashCode() { return hash; }

        @Override
        public boolean equals(Object obj) {
            Link l = (Link)obj; // Unsafe, but I know the contexts where I'm using this class...
            if (this == l) return true; // Essential
            if (hash != l.hash || missingIdx != l.missingIdx) return false;
            for (int i = 0; i < str.length(); i++) {
                if (i != missingIdx && str.charAt(i) != l.str.charAt(i)) return false;
            }
            return true;
        }
    }

然后,图生成的前半部分变为

        Map<String, Set<Link>> wordsToLinks = new HashMap<String, Set<Link>>();
        Map<Link, Set<String>> linksToWords = new HashMap<Link, Set<String>>();

        // Cost: O(V * n * hash)
        for (String word : words)
        {
            // apidoc: The hash code for a String object is computed as
            // s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
            // Cost: O(n * hash)
            int hashCode = word.hashCode();
            int pow = 1;
            for (int j = word.length() - 1; j >= 0; j--) {
                Link link = new Link(word, hashCode - word.charAt(j) * pow, j);
                add(wordsToLinks, word, link);
                add(linksToWords, link, word);
                pow *= 31;
            }
        }

通过使用哈希码的结构,我们可以在中生成链接O(V * n)。但是,这具有连锁效应。我的假设是哈希查找是恒定时间,这是假设比较对象是否相等便宜。但是,Link的相等性测试是O(n)最坏的情况。最坏的情况是当我们从不同的单词生成的两个相等的链接之间发生哈希冲突时-即它发生O(E)在图生成的后半部分。除此之外,除非在非相等链接之间发生哈希冲突的可能性很小,否则我们很好。所以我们用O(V * n^2)换了O(E * n * hash)。请参阅我先前关于统计的观点。


我相信8192是BufferedReader的默认缓冲区大小(在SunVM上)
2011年

@ st0le,我在高尔夫球版本中省略了该参数,并且在没有高尔夫球的版本中也无害。
彼得·泰勒

5

爪哇

复杂度:(我没有CompSci学位,所以我很乐意为您提供帮助。)

输入:在命令行中提供单词对(如果需要,可以多于一对)。如果未指定命令行,则选择2个不同的随机词。

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

public class M {

    // for memoization
    private static Map<String, List<String>> memoEdits = new HashMap<String, List<String>>(); 
    private static Set<String> dict;

    private static List<String> edits(String word, Set<String> dict) {
        if(memoEdits.containsKey(word))
            return memoEdits.get(word);

        List<String> editsList = new LinkedList<String>();
        char[] letters = word.toCharArray();
        for(int i = 0; i < letters.length; i++) {
            char hold = letters[i];
            for(char ch = 'A'; ch <= 'Z'; ch++) {
                if(ch != hold) {
                    letters[i] = ch;
                    String nWord = new String(letters);
                    if(dict.contains(nWord)) {
                        editsList.add(nWord);
                    }
                }
            }
            letters[i] = hold;
        }
        memoEdits.put(word, editsList);
        return editsList;
    }

    private static Map<String, String> bfs(String wordFrom, String wordTo,
                                           Set<String> dict) {
        Set<String> visited = new HashSet<String>();
        List<String> queue = new LinkedList<String>();
        Map<String, String> pred = new HashMap<String, String>();
        queue.add(wordFrom);
        while(!queue.isEmpty()) {
            String word = queue.remove(0);
            if(word.equals(wordTo))
                break;

            for(String nWord: edits(word, dict)) {
                if(!visited.contains(nWord)) {
                    queue.add(nWord);
                    visited.add(nWord);
                    pred.put(nWord, word);
                }
            }
        }
        return pred;
    }

    public static void printPath(String wordTo, String wordFrom) {
        int c = 0;
        Map<String, String> pred = bfs(wordFrom, wordTo, dict);
        do {
            System.out.println(wordTo);
            c++;
            wordTo = pred.get(wordTo);
        }
        while(wordTo != null && !wordFrom.equals(wordTo));
        System.out.println(wordFrom);
        if(wordTo != null)
            System.out.println(c - 1);
        else
            System.out.println("OY");
        System.out.println();
    }

    public static void main(String[] args) throws Exception {
        BufferedReader scan = new BufferedReader(new FileReader(new File("c:\\332609\\dict.txt")),
                                                 40 * 1024);
        String line;
        dict = new HashSet<String>(); //the dictionary (1 word per line)
        while((line = scan.readLine()) != null) {
            dict.add(line);
        }
        scan.close();
        if(args.length == 0) { // No Command line Arguments? Pick 2 random
                               // words.
            Random r = new Random(System.currentTimeMillis());
            String[] words = dict.toArray(new String[dict.size()]);
            int x = r.nextInt(words.length), y = r.nextInt(words.length);
            while(x == y) //same word? that's not fun...
                y = r.nextInt(words.length);
            printPath(words[x], words[y]);
        }
        else { // Arguments provided, search for path pairwise
            for(int i = 0; i < args.length; i += 2) {
                if(i + 1 < args.length)
                    printPath(args[i], args[i + 1]);
            }
        }
    }
}

我使用了“记忆化”功能,以获得更快的结果。字典路径是硬编码的。
st0le 2011年

@Joey,过去是,现在不再是。现在,它具有一个静态字段,该字段每次都会递增并添加到中System.nanoTime()
彼得·泰勒,

@Joey,啊,好的,但是我现在就离开,不想增加我的修订版本:P
st0le 2011年

哦,顺便说一句,我在上班,那些拼字游戏的网站显然被封锁了,所以我无法访问字典...到明天早上最好会产生那10个独特的单词。干杯!
st0le 2011年

您可以通过执行双向bfs来降低(计算)复杂度,即从两侧搜索并在遇到从另一侧访问的节点时停止。
2011年

3

UNIX上的C

使用dijkstra算法。

该代码的很大一部分是服装n元树实现,用于保留

  • 假设文件IO较慢,单词表(因此将输入文件的读取次数减至最少(两次为无参数,一次为其他情况))
  • 我们构建它们时的局部树。
  • 最终路径。

任何人都希望看到如何它的工作原理或许应该阅读findPathprocess以及processOne(及其相关的评论)。也许buildPath还有buildPartialPath。其余的是簿记和脚手架。在测试和开发过程中使用了一些例程,但在“生产”版本中没有使用这些例程。

我用/usr/share/dict/words我的Mac OS 10.5盒子,里面有这么多长,深奥的条目让它完全随机运行产生很大OY秒。

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <getline.h>
#include <time.h>
#include <unistd.h>
#include <ctype.h>

const char*wordfile="/usr/share/dict/words";
/* const char*wordfile="./testwords.txt"; */
const long double RANDOM_MAX = (2LL<<31)-1;

typedef struct node_t {
  char*word;
  struct node_t*kids;
  struct node_t*next;
} node;


/* Return a pointer to a newly allocated node. If word is non-NULL, 
 * call setWordNode;
 */
node*newNode(char*word){
  node*n=malloc(sizeof(node));
  n->word=NULL;
  n->kids=NULL;
  n->next=NULL;
  if (word) n->word = strdup(word);
  return n;
}
/* We can use the "next" links to treat these as a simple linked list,
 * and further can make it a stack or queue by
 *
 * * pop()/deQueu() from the head
 * * push() onto the head
 * * enQueue at the back
 */
void push(node*n, node**list){
  if (list==NULL){
    fprintf(stderr,"Active operation on a NULL list! Exiting\n");
    exit(5);
  }
  n->next = (*list);
  (*list) = n;
}
void enQueue(node*n, node**list){
  if (list==NULL){
    fprintf(stderr,"Active operation on a NULL list! Exiting\n");
    exit(5);
  }
  if ( *list==NULL ) {
    *list=n;
  } else {
    enQueue(n,&((*list)->next));
  }
}
node*pop(node**list){
  node*temp=NULL;
  if (list==NULL){
    fprintf(stderr,"Active operation on a NULL list! Exiting\n");
    exit(5);
  }
  temp = *list;
  if (temp != NULL) {
    (*list) = temp->next;
    temp->next=NULL;
  }
  return temp;
}
node*deQueue(node**list){ /* Alias for pop */
  return pop(list);
}

/* return a pointer to a node in tree matching word or NULL if none */
node* isInTree(char*word, node*tree){
  node*isInNext=NULL;
  node*isInKids=NULL;
  if (tree==NULL || word==NULL) return NULL;
  if (tree->word && (0 == strcasecmp(word,tree->word))) return tree;
  /* prefer to find the target at shallow levels so check the siblings
     before the kids */
  if (tree->next && (isInNext=isInTree(word,tree->next))) return isInNext;
  if (tree->kids && (isInKids=isInTree(word,tree->kids))) return isInKids;
  return NULL;
}

node* freeTree(node*t){
  if (t==NULL) return NULL;
  if (t->word) {free(t->word); t->word=NULL;}
  if (t->next) t->next=freeTree(t->next);
  if (t->kids) t->kids=freeTree(t->kids);
  free(t);
  return NULL;
}

void printTree(node*t, int indent){
  int i;
  if (t==NULL) return;
  for (i=0; i<indent; i++) printf("\t"); printf("%s\n",t->word);
  printTree(t->kids,indent+1);
  printTree(t->next,indent);
}

/* count the letters of difference between two strings */
int countDiff(const char*w1, const char*w2){
  int count=0;
  if (w1==NULL || w2==NULL) return -1;
  while ( (*w1)!='\0' && (*w2)!='\0' ) {
    if ( (*w1)!=(*w2) ) count++;
    w1++;
    w2++;
  }
  return count;
}

node*buildPartialPath(char*stop, node*tree){
  node*list=NULL;
  while ( (tree != NULL) && 
      (tree->word != NULL) && 
      (0 != strcasecmp(tree->word,stop)) ) {
    node*kid=tree->kids;
    node*newN = newNode(tree->word);
    push(newN,&list);
    newN=NULL;
    /* walk over all all kids not leading to stop */
    while ( kid && 
        (strcasecmp(kid->word,stop)!=0) &&
        !isInTree(stop,kid->kids) ) {
      kid=kid->next;
    }
    if (kid==NULL) {
      /* Assuming a preconditions where isInTree(stop,tree), we should
       * not be able to get here...
       */
      fprintf(stderr,"Unpossible!\n");
      exit(7);
    } 
    /* Here we've found a node that either *is* the target or leads to it */
    if (strcasecmp(stop,kid->word) == 0) {
      break;
    }
    tree = kid;
  }
  return list; 
}
/* build a node list path 
 *
 * We can walk down each tree, identfying nodes as we go
 */
node*buildPath(char*pivot,node*frontTree,node*backTree){
  node*front=buildPartialPath(pivot,frontTree);
  node*back=buildPartialPath(pivot,backTree);
  /* weld them together with pivot in between 
  *
  * The front list is in reverse order, the back list in order
  */
  node*thePath=NULL;
  while (front != NULL) {
    node*n=pop(&front);
    push(n,&thePath);
  }
  if (pivot != NULL) {
    node*n=newNode(pivot);
    enQueue(n,&thePath);
  }
  while (back != NULL) {
    node*n=pop(&back);
    enQueue(n,&thePath);
  }
  return thePath;
}

/* Add new child nodes to the single node in ts named by word. Also
 * queue these new word in q
 * 
 * Find node N matching word in ts
 * For tword in wordList
 *    if (tword is one change from word) AND (tword not in ts)
 *        add tword to N.kids
 *        add tword to q
 *        if tword in to
 *           return tword
 * return NULL
 */
char* processOne(char *word, node**q, node**ts, node**to, node*wordList){
  if ( word==NULL || q==NULL || ts==NULL || to==NULL || wordList==NULL ) {
    fprintf(stderr,"ProcessOne called with NULL argument! Exiting.\n");
    exit(9);
  }
  char*result=NULL;
  /* There should be a node in ts matching the leading node of q, find it */
  node*here = isInTree(word,*ts);
  /* Now test each word in the list as a possible child of HERE */
  while (wordList != NULL) {
    char *tword=wordList->word;
    if ((1==countDiff(word,tword)) && !isInTree(tword,*ts)) {
      /* Queue this up as a child AND for further processing */
      node*newN=newNode(tword);
      enQueue(newN,&(here->kids));
      newN=newNode(tword);
      enQueue(newN,q);
      /* This might be our pivot */
      if ( isInTree(tword,*to) ) {
    /* we have found a node that is in both trees */
    result=strdup(tword);
    return result;
      }
    }
    wordList=wordList->next;
  }
  return result;
}

/* Add new child nodes to ts for all the words in q */
char* process(node**q, node**ts, node**to, node*wordList){
  node*tq=NULL;
  char*pivot=NULL;
  if ( q==NULL || ts==NULL || to==NULL || wordList==NULL ) {
    fprintf(stderr,"Process called with NULL argument! Exiting.\n");
    exit(9);
  }
  while (*q && (pivot=processOne((*q)->word,&tq,ts,to,wordList))==NULL) {
    freeTree(deQueue(q));
  }
  freeTree(*q); 
  *q=tq;
  return pivot;
}

/* Find a path between w1 and w2 using wordList by dijkstra's
 * algorithm
 *
 * Use a breadth-first extensions of the trees alternating between
 * trees.
 */
node* findPath(char*w1, char*w2, node*wordList){
  node*thePath=NULL; /* our resulting path */
  char*pivot=NULL; /* The node we find that matches */
  /* trees of existing nodes */
  node*t1=newNode(w1); 
  node*t2=newNode(w2);
  /* queues of nodes to work on */
  node*q1=newNode(w1);
  node*q2=newNode(w2);

  /* work each queue all the way through alternating until a word is
     found in both lists */
  while( (q1!=NULL) && ((pivot = process(&q1,&t1,&t2,wordList)) == NULL) &&
     (q2!=NULL) && ((pivot = process(&q2,&t2,&t1,wordList)) == NULL) )
    /* no loop body */ ;


  /* one way or another we are done with the queues here */
  q1=freeTree(q1);
  q2=freeTree(q2);
  /* now construct the path */
  if (pivot!=NULL) thePath=buildPath(pivot,t1,t2);
  /* clean up after ourselves */
  t1=freeTree(t1);
  t2=freeTree(t2);

  return thePath;
}

/* Convert a non-const string to UPPERCASE in place */
void upcase(char *s){
  while (s && *s) {
    *s = toupper(*s);
    s++;
  }
}

/* Walks the input file stuffing lines of the given length into a list */
node*getListWithLength(const char*fname, int len){
  int l=-1;
  size_t n=0;
  node*list=NULL;
  char *line=NULL;
  /* open the word file */
  FILE*f = fopen(fname,"r");
  if (NULL==f){
    fprintf(stderr,"Could not open word file '%s'. Exiting.\n",fname);
    exit(3);
  }
  /* walk the file, trying each word in turn */
  while ( !feof(f) && ((l = getline(&line,&n,f)) != -1) ) {
    /* strip trailing whitespace */
    char*temp=line;
    strsep(&temp," \t\n");
    if (strlen(line) == len) {
      node*newN = newNode(line);
      upcase(newN->word);
      push(newN,&list);
    }
  }
  fclose(f);
  return list;
}

/* Assumes that filename points to a file containing exactly one
 * word per line with no other whitespace.
 * It will return a randomly selected word from filename.
 *
 * If veto is non-NULL, only non-matching words of the same length
 * wll be considered.
 */
char*getRandomWordFile(const char*fname, const char*veto){
  int l=-1, count=1;
  size_t n=0;
  char *word=NULL;
  char *line=NULL;
  /* open the word file */
  FILE*f = fopen(fname,"r");
  if (NULL==f){
    fprintf(stderr,"Could not open word file '%s'. Exiting.\n",fname);
    exit(3);
  }
  /* walk the file, trying each word in turn */
  while ( !feof(f) && ((l = getline(&line,&n,f)) != -1) ) {
    /* strip trailing whitespace */
    char*temp=line;
    strsep(&temp," \t\n");
    if (strlen(line) < 2) continue; /* Single letters are too easy! */
    if ( (veto==NULL) || /* no veto means chose from all */ 
     ( 
      ( strlen(line) == strlen(veto) )  && /* veto means match length */
      ( 0 != strcasecmp(veto,line) )       /* but don't match word */ 
       ) ) { 
      /* This word is worthy of consideration. Select it with random
         chance (1/count) then increment count */
      if ( (word==NULL) || (random() < RANDOM_MAX/count) ) {
    if (word) free(word);
    word=strdup(line);
      }
      count++;
    }
  }
  fclose(f);
  upcase(word);
  return word;
}

void usage(int argc, char**argv){
  fprintf(stderr,"%s [ <startWord> [ <endWord> ]]:\n\n",argv[0]);
  fprintf(stderr,
      "\tFind the shortest transformation from one word to another\n");
  fprintf(stderr,
      "\tchanging only one letter at a time and always maintaining a\n");
  fprintf(stderr,
      "\tword that exists in the word file.\n\n");
  fprintf(stderr,
      "\tIf startWord is not passed, chose at random from '%s'\n",
      wordfile);
  fprintf(stderr,
      "\tIf endWord is not passed, chose at random from '%s'\n",
      wordfile);
  fprintf(stderr,
      "\tconsistent with the length of startWord\n");
  exit(2);
}

int main(int argc, char**argv){
  char *startWord=NULL;
  char *endWord=NULL;

  /* intialize OS services */
  srandom(time(0)+getpid());
  /* process command line */
  switch (argc) {
  case 3:
    endWord = strdup(argv[2]);
    upcase(endWord);
  case 2:
    startWord = strdup(argv[1]);
    upcase(startWord);
  case 1:
    if (NULL==startWord) startWord = getRandomWordFile(wordfile,NULL);
    if (NULL==endWord)   endWord   = getRandomWordFile(wordfile,startWord);
    break;
  default:
    usage(argc,argv);
    break;
  }
  /* need to check this in case the user screwed up */
  if ( !startWord || ! endWord || strlen(startWord) != strlen(endWord) ) {
    fprintf(stderr,"Words '%s' and '%s' are not the same length! Exiting\n",
        startWord,endWord);
    exit(1);
  }
  /* Get a list of all the words having the right length */
  node*wordList=getListWithLength(wordfile,strlen(startWord));
  /* Launch into the path finder*/
  node *theList=findPath(startWord,endWord,wordList);
  /* Print the resulting path */
  if (theList) {
    int count=-2;
    while (theList) {
      printf("%s\n",theList->word);
      theList=theList->next;
      count++;
    }
    printf("%d\n",count);
  } else {
    /* No path found case */
    printf("%s %s OY\n",startWord,endWord);
  }
  return 0;
}

一些输出:

$ ./changeword dive name
DIVE
DIME
DAME
NAME
2
$ ./changeword house gorge
HOUSE
HORSE
GORSE
GORGE
2
$ ./changeword stop read
STOP
STEP
SEEP
SEED
REED
READ
4
$ ./changeword peace slate
PEACE
PLACE
PLATE
SLATE
2
$ ./changeword pole fast  
POLE
POSE
POST
PAST
FAST
3
$ ./changeword          
QUINTIPED LINEARITY OY
$ ./changeword sneaky   
SNEAKY WAXILY OY
$ ./changeword TRICKY
TRICKY
PRICKY
PRINKY
PRANKY
TRANKY
TWANKY
SWANKY
SWANNY
SHANNY
SHANTY
SCANTY
SCATTY
SCOTTY
SPOTTY
SPOUTY
STOUTY
STOUTH
STOUSH
SLOUSH
SLOOSH
SWOOSH
19
$ ./changeword router outlet
ROUTER
ROTTER
RUTTER
RUTHER
OUTHER
OUTLER
OUTLET
5
$ ./changeword 
IDIOM
IDISM
IDIST
ODIST
OVIST
OVEST
OVERT
AVERT
APERT
APART
SPART
SPARY
SEARY
DEARY
DECRY
DECAY
DECAN
DEDAN
SEDAN
17

复杂性分析是不平凡的。搜索是双向的,加深的。

  • 对于检查的每个节点,我都会遍历整个单词列表(尽管仅限于长度合适的单词)。呼叫清单的长度W
  • 最小步骤数是S_min = (<number of different letter>-1)因为如果我们仅相隔一个字母,我们将在0个中间步骤处对更改进行评分。最大值很难量化,请参见上面的TRICKY-SWOOSH。树的每个半会S/2-1S/2
  • 我没有对树的分支行为进行分析,而是将其称为B

因此,最少的操作次数是 2 * (S/2)^B * W不正确的。


也许这对我来说太幼稚了,但是在您的设计或实现中我看不到任何需要边缘权重的东西。尽管Dijkstra的确适用于非加权图(边缘权重始终为“ 1”),但这里是否不采用简单的广度优先搜索来改善边界O(|V|+|E|)而不是O(|E|+|V| log |V|)
MrGomez
By using our site, you acknowledge that you have read and understand our Cookie Policy and Privacy Policy.
Licensed under cc by-sa 3.0 with attribution required.