Levenshtein distance


40

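While there are many edit-distance questions, such as this one, there isn't a simple question that asks for a program calculating the Levenshtein distance.

Some exposition

The Levenshtein edit distance between two strings is the minimum possible number of insertions, deletions, or substitutions needed to convert one word into the other. In this case, each insertion, deletion and substitution has a cost of 1.

For example, the distance between roll and rolling is 3, because deletions cost 1 and we need to delete 3 characters. The distance between toll and tall is 1, because substitutions cost 1.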
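As a point of reference for the definition above, here is a minimal, ungolfed Python sketch of the classic full-matrix dynamic-programming approach. The function name `levenshtein` and the table name `dist` are illustrative choices, not part of the challenge.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of unit-cost insertions, deletions and substitutions
    needed to turn a into b."""
    # dist[i][j] = distance between the first i characters of a
    # and the first j characters of b
    dist = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dist[i][0] = i  # delete all i characters
    for j in range(len(b) + 1):
        dist[0][j] = j  # insert all j characters
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            # substituting costs 1 only when the characters differ
            substitution = dist[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            dist[i][j] = min(dist[i - 1][j] + 1,   # deletion
                             dist[i][j - 1] + 1,   # insertion
                             substitution)
    return dist[len(a)][len(b)]


print(levenshtein("roll", "rolling"))  # 3
print(levenshtein("toll", "tall"))     # 1
```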

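Rules

  • The input will be two strings. You may assume that the strings are lowercase, contain only letters, are non-empty, and are at most 100 characters long.
  • The output will be the minimum Levenshtein edit distance of the two strings, as defined above.
  • Your code must be a program or a function. It does not need to be a named function, but it cannot be a built-in function that directly computes the Levenshtein distance. Other built-ins are allowed.
  • This is code golf, so the shortest answer wins.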

Some examples

>>> lev("atoll", "bowl")
3
>>> lev("tar", "tarp")
1
>>> lev("turing", "tarpit")
4
>>> lev("antidisestablishmentarianism", "bulb")
27

As always, if the problem is unclear, please let me know. Good luck and good golfing!


Answers:


8

Pyth, 34 bytes

J]wf}z=Jsmsm++.DdkXLdkGXLkdGhld-Jk

Demonstration

This is not particularly well golfed, and it is very slow. It can't handle anything past 2 changes in a reasonable amount of time.


3
But it works, and that's what counts. :P
Conor O'Brien

10

Matlab, 177 163 bytes

function l=c(a,b);m=nnz(a)+1;n=nnz(b)+1;for i=0:m-1;for j=0:n-1;z=max(i,j);try;z=min([l(i,j+1)+1,l(i+1,j)+1,l(i,j)+(a(i)~=b(j))]);end;l(i+1,j+1)=z;end;end;l=l(m,n)

This is a straightforward implementation of this formula:

[image: formula for the Levenshtein distance]
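Assuming the formula is the standard recursive definition (which matches the code: max(i,j) whenever an index is zero, otherwise the minimum over three neighbouring cells), it can be written as follows, where $[a_i \neq b_j]$ is 1 if the characters differ and 0 otherwise:

```latex
\operatorname{lev}_{a,b}(i,j) =
\begin{cases}
  \max(i,j) & \text{if } \min(i,j) = 0, \\[4pt]
  \min
  \begin{cases}
    \operatorname{lev}_{a,b}(i-1,j) + 1 \\
    \operatorname{lev}_{a,b}(i,j-1) + 1 \\
    \operatorname{lev}_{a,b}(i-1,j-1) + [a_i \neq b_j]
  \end{cases} & \text{otherwise.}
\end{cases}
```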

Ungolfed:

function l=l(a,b);
m=nnz(a)+1;
n=nnz(b)+1;
for i=0:m-1;
    for j=0:n-1;
        z=max(i,j);
        try;
            z=min([l(i,j+1)+1,l(i+1,j)+1,l(i,j)+(a(i)~=b(j))]);
        end;
        l(i+1,j+1)=z;
    end;
end;
l=l(m,n)

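If the scored code is not the code you've included, please include the scored code. Otherwise I think there's a lot of whitespace that can be golfed away.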
Alex A.

1
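@AlexA. The leading spaces and newlines used for indentation are not counted (and can safely be removed). Once upon a time this was allowed and no one complained.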
edc65

1
@edc65 The meta consensus now is that the code as scored should be provided.
Alex A.

2
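Well then, the majority prefers the unreadable version. I still keep the readable version here, in case anyone wants to see what is actually going on =)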
flawr

2
Usually both the golfed (scored) and the ungolfed versions are provided; we only require that the scored one be included. ;)
Alex A.

7

Python 2, 151 140 138 bytes

A slow recursive implementation of the Levenshtein distance based on Wikipedia (thanks to @Kenney for shaving off 11 characters and @Sherlock9 for another 2).

def l(s,t):
 def f(m,n):
  if m*n<1:return m or n
  return 1+min([f(m-1,n),f(m,n-1),f(m-1,n-1)-(s[m-1]==t[n-1])])
 return f(len(s),len(t))

It gives the correct answers for the presented test cases:

assert l("tar", "tarp") == 1
assert l("turing", "tarpit") == 4
assert l("antidisestablishmentarianism", "bulb") == 27        
assert l("atoll", "bowl") == 3
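The recursion above is slow because it recomputes the same (m, n) subproblems many times. As a sketch of one way around that (not part of the golfed answer), the same recurrence can be memoized in Python 3 with `functools.lru_cache`; the name `lev` is an illustrative choice.

```python
from functools import lru_cache


def lev(s: str, t: str) -> int:
    @lru_cache(maxsize=None)
    def f(m: int, n: int) -> int:
        # distance between the first m characters of s and the first n of t
        if m == 0 or n == 0:
            return m or n  # one prefix is empty: the other's length
        return 1 + min(
            f(m - 1, n),                                  # deletion
            f(m, n - 1),                                  # insertion
            f(m - 1, n - 1) - (s[m - 1] == t[n - 1]),     # substitution, free on a match
        )

    return f(len(s), len(t))


print(lev("turing", "tarpit"))  # 4
```

With the cache, each of the (len(s)+1) * (len(t)+1) subproblems is evaluated once, so the longest test case finishes immediately.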

1
By doing something like if !n*m:return n if n else m you might save 3-4 bytes or so, and another 2-3 bytes by doing return 1+min([ f(..), f(..), f(..) - (s[..] == t[..]) ])
Kenney

You can save 2 bytes by using f(m-1,n-1)-(s[m-1]==t[n-1]) instead of f(m-1,n-1)+(s[m-1]!=t[n-1])-1
Sherlock9


5

JavaScript (ES6), 106 113 122

Edit: 16 bytes saved following @Neil's suggestions

As an anonymous function.

(s,t)=>[...s].map((u,i)=>w=w.map((v,j)=>p=j--?Math.min(p,v,w[j]-(u==t[j]))+1:i+1),w=[...[,...t].keys()])|p

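This is a golfed implementation of the Wagner-Fischer algorithm, exactly as described in the linked Wikipedia article in the section "Iterative with two matrix rows" (even though in fact only 1 row is used: the array w).

Less golfed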

(s,t)=>
{
  w = [...[0,...t].keys()];
  for(i = 0; i < s.length; i++)
    w = w.map((v,j)=>
              p = j
              ? Math.min(p+1, v+1, w[j-1] + (s[i]!=t[j-1]))
              : i+1
             );
  return p
}
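For readers more at home in Python, the single-row idea used above can be sketched as follows. This is an illustrative transliteration, not the author's code; the names `levenshtein_one_row`, `row`, `diagonal` and `above` are made up for the example.

```python
def levenshtein_one_row(s: str, t: str) -> int:
    # row[j] holds the distance between the processed prefix of s and t[:j]
    row = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        diagonal = row[0]  # previous row's value at column j-1
        row[0] = i         # distance from s[:i] to the empty string
        for j, ct in enumerate(t, start=1):
            above = row[j]  # not overwritten yet, so still the previous row
            row[j] = min(above + 1,               # deletion
                         row[j - 1] + 1,          # insertion
                         diagonal + (cs != ct))   # substitution or match
            diagonal = above
    return row[-1]


print(levenshtein_one_row("atoll", "bowl"))  # 3
```

Only the previous row's diagonal value has to be remembered separately, which is the role the variable p and the old w play in the JavaScript version.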

Test snippet

L=(s,t)=>[...s].map((u,i)=>w=w.map((v,j)=>p=j--?Math.min(p,v,w[j]-(u==t[j]))+1:i+1),w=[...[,...t].keys()])|p

console.log=x=>O.textContent+=x+'\n';

[["atoll", "bowl"],["tar", "tarp"]
,["turing", "tarpit"],["antidisestablishmentarianism", "bulb"]]
.forEach(t=>console.log(t+' => '+L(...t)))
<pre id=O></pre>


1
Can you use [...[0,...t].keys()] instead? That saves 2 bytes if you can.
Neil

1
@Neil that looks ugly, but it's shorter. Thx
edc65

1
Actually you can save another byte; I think [...[,...t].keys()] works too.
Neil

I managed to shave off another byte by using [...s].map(): (s,t)=>(w=[...[,...t].keys()],[...s].map((u,i)=>w=w.map((v,j)=>p=j--?Math.min(p,v,w[j]-(s[i-1]==t[j]))+1:i)),p)
Neil

@Neil great, thanks again!
edc65

4

Python 2, 118 bytes

A golf of this solution, but it doesn't look like Willem has been active for a year, so I'll have to post it myself:

def l(s,t):f=lambda m,n:m or n if m*n<1else-~min(f(m-1,n),f(m,n-1),f(m-1,n-1)-(s[m-1]==t[n-1]));print f(len(s),len(t))

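Try it on repl.it

Takes two strings and outputs the distance to STDOUT (allowed by meta). Please comment with suggestions; I'm sure this can be golfed further.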


Is it necessary to wrap everything in a function? Could you use two input()s or an input().split()?
Sherlock9

@Sherlock9 I tried that, but as far as I can tell it costs 1 byte extra
FlipTack

Right, I forgot that you need to define s and t somewhere in the code. Never mind. Good job :D
Sherlock9

I'm not sure why Willem used m or n. You can replace it with m+n.
Arnauld

3

AutoIt, 333 bytes

Func l($0,$1,$_=StringLen,$z=StringMid)
Dim $2=$_($0),$3=$_($1),$4[$2+1][$3+1]
For $5=0 To $2
$4[$5][0]=$5
Next
For $6=0 To $3
$4[0][$6]=$6
Next
For $5=1 To $2
For $6=1 To $3
$9=$z($0,$5,1)<>$z($1,$6,1)
$7=1+$4[$5][$6-1]
$8=$9+$4[$5-1][$6-1]
$m=1+$4[$5-1][$6]
$m=$m>$7?$7:$m
$4[$5][$6]=$m>$8?$8:$m
Next
Next
Return $4[$2][$3]
EndFunc

Sample test code:

ConsoleWrite(l("atoll", "bowl") & @LF)
ConsoleWrite(l("tar", "tarp") & @LF)
ConsoleWrite(l("turing", "tarpit") & @LF)
ConsoleWrite(l("antidisestablishmentarianism", "bulb") & @LF)

yields

3
1
4
27

3

k4, 66 bytes

{$[~#x;#y;~#y;#x;&/.z.s'[-1 0 -1_\:x;0 -1 -1_\:y]+1 1,~(*|x)=*|y]}

A boring and basically ungolfed implementation of the algorithm. For example:

  f:{$[~#x;#y;~#y;#x;&/.z.s'[-1 0 -1_\:x;0 -1 -1_\:y]+1 1,~(*|x)=*|y]}
  f["kitten";"sitting"]
3
  f["atoll";"bowl"]
3
  f["tar";"tarp"]
1
  f["turing";"tarpit"]
4
  f["antidisestablishmentarianism";"bulb"]
27

3

Seriously, 86 82 78 bytes

,#,#`k;;;░="+l"£@"│d);)[]oq╜Riu)@d);)@[]oq╜Riu(@)@)@[]oq╜Ri3}@)=Y+km"£@IRi`;╗ƒ

Hex dump:

2c232c23606b3b3b3bb03d222b6c229c4022b364293b295b5d6f71bd526975294064293b29405b
5d6f71bd5269752840294029405b5d6f71bd5269337d40293d592b6b6d229c40495269603bbb9f

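Try it online

(Note that the link is to a different version, because something about the online interpreter breaks with the new, shorter version, even though it works fine with the downloadable interpreter.)

It's about the most straightforward implementation Seriously allows for the recursive definition. It is very slow because it does no memoization at all. Perhaps the tabular method could be shorter (maybe by using the registers as rows), but I'm fairly happy with this, despite how many workarounds for the language's shortcomings it contains. That one can use

[]oq`<code>`Ri

as a proper two-argument function call was quite a nice find.

Explanation: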

,#,#                             Read in two arguments, break them into lists of chars
    `                       `;╗ƒ put the quoted function in reg0 and immediately call it
     k;;;                        put the two lists in a list and make 3 copies
         ░                       replace the latter two with one with empty lists removed
          =                      replace two more with 1 if no empty lists removed, else 0
           "..."£@"..."£@        push the two functions described below, moving 
                                 the boolean above them both
                         I       select the correct function based on the condition
                          Ri     call the function, returning the correct distance
                                 for these substrings

   There are two functions that can be called from the main function above. Each expects 
   two strings, i and j, to be on the stack. This situation is ensured by putting 
   those strings in a list and using R to call the functions with that list as the stack.
   The first is very simple:

+l                             Concatenate the strings and take their length.
                               This is equivalent to the length of the longer
                               string, since one of the strings will be empty.

   The second function is very long and complicated. It will do the "insertion, deletion, 
   substitution" part of the recursive definition. Here's what happens in 4 parts:

│d);)                          After this, the stack is top[i-,j,i,j,ci,i-], where i- is 
                               list i with its last character, ci, chopped off.
     []oq                      this puts i- and j into a list so that they can be passed
                               as arguments recursively into the main function
         ╜Riu                  this calls the main function (from reg0) with the args
                               which will return a number to which we add 1 to get #d,
                               the min distance if we delete a character
)@d);)@                        After this, the stack is top[i,j-,ci,i-,#d,cj,j-], where 
                               j- and cj are the same idea as i- and ci
       []oq╜Riu                listify arguments, recurse and increment to get #i
                               (distance if we insert)
(@)@)@                         After this, the stack is top[i-,j-,#d,cj,#i,ci]
      []oq╜Ri                  listify arguments, recurse to get min distance between 
                               them but we still need to add 1 when we'd need to 
                               substitute because the chars we chopped off are different
(((@)                          After this, the stack is top[cj,ci,#s,#d,#i]
     =Y                        1 if they are not equal, 0 if they are
       +                       add it to the distance we find to get the distance
                               if we substitute here
        k                      put them all in a list
         m                     push the minimum distance over the three options

I like how the code tries to escape the pre element :)
mınxomaτ

3

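Python 3, 267 216 184 162 bytes

This function calculates the Levenshtein distance using an array that is 2 x len(word_2)+1 in size.

Edit: This doesn't get close to Willem's Python 2 answer, but here is a more golfed answer with many small refinements here and there.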

def e(p,q):
 m=len(q);r=range;*a,=r(m+1);b=[1]*-~m
 for i in r(len(p)):
  for j in r(m):b[j+1]=1+min(a[j+1],b[j],a[j]-(p[i]==q[j]))
  a,b=b,[i+2]*-~m
 return a[m]

Ungolfed:

def edit_distance(word_1,word_2):
    len_1 = len(word_1)
    len_2 = len(word_2)
    dist = [[x for x in range(len_2+1)], [1 for y in range(len_2+1)]]
    for i in range(len_1):
        for j in range(len_2):
            if word_1[i] == word_2[j]:
                dist[1][j+1] = dist[0][j]
            else:
                deletion = dist[0][j+1]+1
                insertion = dist[1][j]+1
                substitution = dist[0][j]+1
                dist[1][j+1] = min(deletion, insertion, substitution)
        dist[0], dist[1] = dist[1], [i+2 for m in range(len_2+1)]
    return dist[0][len_2]

3

Retina, 78 72 bytes

&`(.)*$(?<!(?=((?<-4>\4)|(?<-1>.(?<-4>)?))*,(?(4),))^.*,((.)|(?<-1>.))*)

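Try it online!

In a way, this is a pure regex solution where the result is the number of positions from which the regex matches. Because why not...

Fair warning, this is super inefficient. The way this works is that it offloads the actual optimisation to the regex engine's backtracker, which simply brute-forces all possible alignments, starting with as few alterations as possible and allowing one more until it is possible to match up the strings with additions, deletions and substitutions.

For a slightly more sensible solution, a second version does the matching only once and doesn't use any negative lookarounds. There, the result is the number of captures in group 2, which you could access with match.Groups[2].Captures.Count in C#, for instance. It is still horribly inefficient, though.

Explanation

I'm explaining the second version above, because it is conceptually a bit easier (as it is just a single regex match). Here is an ungolfed version where I've named the groups (or made them non-capturing) and added comments. Remember that the components in a lookbehind should be read from back to front, but alternatives and lookaheads inside those should be read from front to back. Yeah.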

.+                      # Ensures backtracking from smallest to largest for next repetition
(?<ops>(?<distance>.))* # This puts the current attempted distances onto two different stacks,
                        # one to work with, and one for the result.
$                       # Make sure the lookbehind starts from the end.
(?<=                    # The basic idea is now to match up the strings character by character,
                        # allowing insertions/deletions/substitutions at the cost of one capture
                        # on <ops>. Remember to read from the bottom up.
  (?=                   # Start matching forwards again. We need to go through the other string
                        # front-to-back due to the nature of the stack (the last character we
                        # remembered from the second string must be the first character we check
                        # against in the first string).
    (?:
      (?<-str>\k<str>)  # Either match the current character against the corresponding one from
                        # the other string.
    |
      (?<-ops>          # Or consume one operation to...
        .               # consume a character without matching it to the other string (a deletion)
        (?<-str>)?      # optionally dropping a character from the other string as well 
                        # (a substitution).
      )
    )*                  # Rinse and repeat.
    ,(?(str),)          # Ensure we reached the end of the first string while consuming all of the 
                        # second string. This is only possible if the two strings can be matched up 
                        # in no more than <distance> operations.
  )
  ^.*,                  # Match the rest of string to get back to the front.
  (?:                   # This remembers the second string from back-to-front.
    (?<str>.)           # Either capture the current character.
  |
    (?<-ops>.)          # Or skip it, consuming an operation. This is an insertion.
  )*
)

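The only difference from the 72-byte version is that we can drop the leading .+ (and the second group at the beginning) by finding positions at the end where we don't have enough <ops>, and counting all of those positions.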
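The search strategy described above (try the smallest number of operations first, then allow one more until the strings can be aligned) does not depend on regexes. As a rough illustration only, and not a translation of the regex itself, it can be imitated in Python with a brute-force check plus an increasing budget; `within` and `levenshtein_by_search` are hypothetical names.

```python
def within(a: str, b: str, budget: int) -> bool:
    """True if a can be turned into b using at most `budget` edits (brute force)."""
    if budget < 0:
        return False
    if not a or not b:
        # only insertions or deletions remain
        return len(a) + len(b) <= budget
    if a[0] == b[0]:
        # matching equal characters costs nothing
        return within(a[1:], b[1:], budget)
    return (within(a[1:], b, budget - 1)          # deletion
            or within(a, b[1:], budget - 1)       # insertion
            or within(a[1:], b[1:], budget - 1))  # substitution


def levenshtein_by_search(a: str, b: str) -> int:
    budget = 0
    # the first budget that admits an alignment is the distance
    while not within(a, b, budget):
        budget += 1
    return budget


print(levenshtein_by_search("turing", "tarpit"))  # 4
```

Like the regex, this is exponential in the distance, so it is only practical for short inputs.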


3

Haskell, 67 64 bytes

e@(a:r)#f@(b:s)=sum[1|a/=b]+minimum[r#f,e#s,r#s]
x#y=length$x++y

Try it online! Usage example: "turing" # "tarpit" yields 4.


Explanation (for the previous 67-byte version)

e@(a:r)#f@(b:s)|a==b=r#s|1<3=1+minimum[r#f,e#s,r#s]
x#y=length$x++y

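This is a recursive solution. Given two strings e and f, we first compare their first characters a and b. If they are equal, then the Levenshtein distance of e and f is the same as the Levenshtein distance of r and s, the remainders of e and f after removing the first characters. Otherwise, either a or b needs to be removed, or one is replaced by the other. [r#f,e#s,r#s] recursively computes the Levenshtein distance for those three cases, minimum picks the smallest one, and 1 is added to account for the necessary remove or replace operation.

If one of the strings is empty, we end up on the second line. In this case, the distance is just the length of the non-empty string, or equivalently the length of both strings concatenated together.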
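The two cases described above can be mirrored almost line for line in Python. This is only an illustrative sketch of the same recursion, reusing the variable names from the explanation; the function name `lev` is made up.

```python
def lev(e: str, f: str) -> int:
    # second line of the Haskell code: one string is empty
    if not e or not f:
        return len(e + f)
    a, r = e[0], e[1:]  # first character and remainder of e
    b, s = f[0], f[1:]  # first character and remainder of f
    if a == b:
        return lev(r, s)
    # remove a, remove b, or replace one by the other
    return 1 + min(lev(r, f), lev(e, s), lev(r, s))


print(lev("turing", "tarpit"))  # 4
```

As in the Haskell version, nothing is cached, so the running time grows exponentially with the input length.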


1
Wow, this is a seriously good solution, really elegant and short.
ggPeti

3

Python 3, 105 94 93 bytes

-11 bytes by xnor

l=lambda a,b:b>""<a and min(l(a[1:],b[1:])+(a[0]!=b[0]),l(a[1:],b)+1,l(a,b[1:])+1)or len(a+b)

Golfed version of the shortest implementation on Wikibooks.

Try it online!


Nice solution. The l= needs to be included and counted because the function is recursive. You can combine the base cases into if b>""<a else len(a+b).
xnor

Nice play with the operators, thanx!
movatica

2

Haskell, 136 bytes

Call e. It is a bit slow on antidisestablishmentarianism etc.

l=length
e a b=v a(l a)b(l b)
v a i b j|i*j==0=i+j|0<1=minimum[1+v a(i-1)b j,1+v a i b(j-1),fromEnum(a!!(i-1)/=b!!(j-1))+v a(i-1)b(j-1)]

2

Jolf, 4 bytes

Try it here!

~LiI
~L   calculate the Levenshtein distance of
  i   input string
   I  and another input string

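I added that builtin yesterday, but only saw this challenge today, i.e. just now. Still, this answer is non-competing.

In a newer version:

~Li

takes an implicit second input.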


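"Your code must be a program or a function. It does not need to be a named function, but it cannot be a built-in function that directly computes the Levenshtein distance. Other built-ins are allowed."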
Kevin Cruijssen

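Ah, I didn't see that you mentioned it's non-competing. Better to put it in the title, or to add a valid program/function without the builtin.
Kevin Cruijssen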

2

GNU Prolog, 133 bytes

m([H|A],B):-B=A;B=[H|A].
d([H|A]-[H|B],D):-d(A-B,D).
d(A-B,D):-A=B,D=0;D#=E+1,m(A,X),m(B,Y),d(X-Y,E).
l(W,D):-d(W,D),(E#<D,l(W,E);!).

Takes a tuple as argument. Example of usage:

| ?- l("turing"-"tarpit",D).

D = 4

yes

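m specifies that B is A either unchanged or with its first character removed. d uses m as a subroutine to calculate an edit distance between the tuple elements (i.e. the cost of some series of edits that converts one into the other). Then l is a standard trick for finding the minimum of d: you take an arbitrary distance, then an arbitrary smaller distance, and repeat until you can't go any smaller.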


1

Perl, 168 166 163 bytes

sub l{my($S,$T,$m)=(@_,100);$S*$T?do{$m=++$_<$m?$_:$m for l($S-1,$T),l($S,--$T),l(--$S,$T)-($a[$S]eq$b[$T]);$m}:$S||$T}print l~~(@a=shift=~/./g),~~(@b=shift=~/./g)

Recursive implementation. Save it as file.pl and run it as perl file.pl atoll bowl.

sub l {
    my($S,$T,$m)=(@_,100);

    $S*$T
    ? do {
        $m = ++$_ < $m ? $_ : $m
        for
            l($S-1,   $T),
            l($S  , --$T),
            l(--$S,   $T) - ($a[$S] eq $b[$T])
        ;    
        $m
    }
    : $S||$T
}
print l~~(@a=shift=~/./g),~~(@b=shift=~/./g)


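The other two implementations are both longer (full matrix: 237 bytes, iterative with two single rows: 187).

  • update 166: omit () when calling l.
  • update 163: eliminate return by abusing do in the ternary.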


0

C, 192 bytes

#define m(x,y) (x>y?x:y)
#define r(x,y) l(s,ls-x,t,lt-y)
int l(char*s,int ls,char*t,int lt){if(!ls)return lt;if(!lt)return ls;a=r(1,1);if(s[ls]==t[ls])return a;return m(m(r(0,1),r(1,0)),a)+1;}
---------

Detailed

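#include <stdio.h>
#include <string.h>

// minimum of two values: the recurrence takes the cheapest edit
#define m(x,y) ((x)<(y)?(x):(y))
// declare a buffer and read one line into it
#define f(x) char x[128];fgets(x,sizeof x,stdin)
// recurse on prefixes shortened by x and y characters
#define r(x,y) l(s,ls-x,t,lt-y)

int l(char*s,int ls,char*t,int lt)
{
    if(!ls) return lt;
    if(!lt) return ls;

    int a = r(1,1);
    // compare the last characters of the two current prefixes
    if(s[ls-1]==t[lt-1]) return a;

    return m(m(r(0,1),r(1,0)),a)+1;
}

int main(void)
{
    f(a);
    f(b);
    // strcspn gives the length without the trailing newline kept by fgets
    printf("%d", l(a,strcspn(a,"\n"),b,strcspn(b,"\n")));
    return 0;
}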

0

C#, 215 210 198

public int L(string s,string t){int c,f,a,i,j;var v=new int[100];for(c=i=0;i<s.Length;i++)for(f=c=i,j=0;j<t.Length;j++){a=c;c=f;f=i==0?j+1:v[j];if(f<a)a=f;v[j]=c=s[i]==t[j]?c:1+(c<a?c:a);}return c;}

More readable:

public int L(string s,string t){
    int c,f,a,i,j;
    var v=new int[100];
    for(c=i=0;i<s.Length;i++)
        for(f=c=i,j=0;j<t.Length;j++){
            a=c;
            c=f;
            f=(i==0)?j+1:v[j];
            if (f<a) a=f;
            v[j]=c=(s[i]==t[j])?c:1+((c<a)?c:a);
        }
    return c;
}

0

PowerShell v3+, 247 bytes

$c,$d=$args;$e,$f=$c,$d|% le*;$m=[object[,]]::new($f+1,$e+1);0..$e|%{$m[0,$_]=$_};0..$f|%{$m[$_,0]=$_};1..$e|%{$i=$_;1..$f|%{$m[$_,$i]=(($m[($_-1),$i]+1),($m[$_,($i-1)]+1),($m[($_-1),($i-1)]+((1,0)[($c[($i-1)]-eq$d[($_-1)])]))|sort)[0]}};$m[$f,$e]

I ended up making this to solve another challenge involving LDs.

Code explanation with comments.

# Get both of the strings passed as arguments. $c being the compare string
# and $d being the difference string. 
$c,$d=$args

# Save the lengths of these strings. $e is the length of $c and $f is the length of $d
$e,$f=$c,$d|% le*

# Create the multidimensional array $m for recording LD calculations
$m=[object[,]]::new($f+1,$e+1)

# Populate the first column 
0..$e|%{$m[0,$_]=$_}

# Populate the first row
0..$f|%{$m[$_,0]=$_}

# Calculate the Levenshtein distance by working through each position in the matrix. 
# Working the columns
1..$e|%{
    # Save the column index for use in the next pipeline
    $i=$_

    # Working the rows.
    1..$f|%{
        # Calculate the smallest value between the following values in the matrix relative to this one
        # cell above, cell to the left, cell in the upper left. 
        # Upper left also contains the cost calculation for this pass.
        $m[$_,$i]=(($m[($_-1),$i]+1),($m[$_,($i-1)]+1),($m[($_-1),($i-1)]+((1,0)[($c[($i-1)]-eq$d[($_-1)])]))|sort)[0]
    }
}
# Return the last element of the matrix to get LD 
$m[$f,$e]
