Parse the Bookworm dictionary format


42

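I've recently been indulging in some nostalgia in the form of Bookworm Deluxe.

In case you haven't seen it before, it's a word game where the goal is to connect adjacent tiles to form words. To determine whether a string is a valid word, the game checks it against an internal dictionary, which is stored in a compressed format that looks like this: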

aa
2h
3ed
ing
s
2l
3iis
s
2rdvark
8s
4wolf
7ves

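The rules for unpacking the dictionary are simple:

  1. Read the number at the start of the line, and copy that many characters from the beginning of the previous word. (If there is no number, copy as many characters as you did last time.)

  2. Append the following letters to the word.

So, our first word is aa, followed by 2h, which means "copy the first two letters of aa and append h", forming aah. Then 3ed becomes aahed, and since the next line doesn't have a number, we copy 3 characters again to form aahing. This process continues throughout the rest of the dictionary. The resulting output from the small sample input is: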

aa
aah
aahed
aahing
aahs
aal
aaliis
aals
aardvark
aardvarks
aardwolf
aardwolves
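The two rules above can be sketched in plain Python as follows. This is an ungolfed illustration rather than one of the submitted answers, and the name `unpack` is made up for this example.

```python
def unpack(lines):
    """Expand prefix-compressed lines into full words."""
    words = []
    prev = ""    # the previously produced word
    count = 0    # how many characters were copied last time
    for line in lines:
        suffix = line.lstrip("0123456789")         # letters after the digits
        digits = line[: len(line) - len(suffix)]   # leading digits, may be empty
        if digits:
            count = int(digits)                    # rule 1: a new prefix length
        prev = prev[:count] + suffix               # rule 2: append the letters
        words.append(prev)
    return words


sample = ["aa", "2h", "3ed", "ing", "s", "2l", "3iis", "s",
          "2rdvark", "8s", "4wolf", "7ves"]
print("\n".join(unpack(sample)))
```

Running it on the sample input prints the twelve words listed above, from aa to aardwolves.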

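Your challenge is to perform this unpacking in as few bytes as possible.

Each line of input will contain zero or more digits 0-9 followed by one or more lowercase letters a-z. You may take input and give output either as a list of strings, or as a single string with words separated by any character other than 0-9/a-z.

Here is another small test case with a few edge cases not covered in the example: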

abc cba 1de fg hi 0jkl mno abcdefghijk 10l
=> abc cba cde cfg chi jkl mno abcdefghijk abcdefghijl
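For the single-string I/O option, a hedged Python sketch might look like the one below. It also checks each entry against the stated input grammar (digits, then at least one lowercase letter). The names `ENTRY` and `unpack_string` are invented for this illustration, and the validation step is an extra that the challenge does not require.

```python
import re

# Zero or more digits followed by one or more lowercase letters.
ENTRY = re.compile(r"([0-9]*)([a-z]+)")


def unpack_string(text, sep=" "):
    """Unpack a separator-delimited string; raise ValueError on bad entries."""
    out, prev, count = [], "", 0
    for entry in text.split(sep):
        m = ENTRY.fullmatch(entry)
        if m is None:
            raise ValueError(f"malformed entry: {entry!r}")
        digits, letters = m.groups()
        if digits:                 # the string "0" is truthy, so 0jkl resets the prefix
            count = int(digits)
        prev = prev[:count] + letters
        out.append(prev)
    return sep.join(out)


print(unpack_string("abc cba 1de fg hi 0jkl mno abcdefghijk 10l"))
```

The edge cases here are the leading entries without any number (nothing is copied), the explicit 0 in 0jkl, and the two-digit count in 10l.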

You may also test your code on the full dictionary: input, output.


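Is it possible for the second line to have no number? Also, can we assume that no number other than 0 has leading 0s?
Erik the Outgolfer

@EriktheOutgolfer Yes, that is possible; I've added it to the test case. And yes, you can assume that (as well as that the number won't be greater than the length of the previous word).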
Doorknob

11
That's a cute compression format :]

1
The locate program uses this type of encoding on pathnames.
Dan D.

I wrote this program for my actual use, about 15 years ago. Unfortunately I don't think I have the source anymore...
hobbs

Answers:



10

JavaScript (ES6), 66 62 61 bytes

a=>a.map(p=s=>a=a.slice([,x,y]=/(\d*)(.*)/.exec(s),p=x||p)+y)

Try it online!

Commented

a =>                  // a[] = input, re-used to store the previous word
  a.map(p =           // initialize p to a non-numeric value
  s =>                // for each string s in a[]:
    a =               //   update a:
      a.slice(        //     extract the correct prefix from the previous word:
        [, x, y] =    //       load into x and y:
          /(\d*)(.*)/ //         the result of a regular expression which splits the new
          .exec(s),   //         entry into x = leading digits and y = trailing letters
                      //       this array is interpreted as 0 by slice()
        p = x || p    //       update p to x if x is not an empty string; otherwise leave
                      //       it unchanged; use this as the 2nd parameter of slice()
      )               //     end of slice()
      + y             //     append the new suffix
  )                   // end of map()

5

Perl 6, 50 48 bytes

-2 bytes thanks to nwellnhof

{my$l;.map:{$!=S[\d*]=substr $!,0,$l [R||]=~$/}}

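Try it online!

A port of Arnauld's solution. That R|| trick was a rollercoaster, from "I think this could be possible", to "nah, it's impossible", to "kinda maybe possible", and finally "aha!"

Explanation: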

{my$l;.map:{$!=S[\d*]=substr $!,0,$l [R||]=~$/}}
{                                              }  # Anonymous code block
 my$l;    # Declare the variable $l, which is used for the previous number
      .map:{                                  }  # Map the input list to
            $!=              # $! is used to save the previous word
               S[\d*]=       # Substitute the number for
                      substr $!,0    # A substring of the previous word
                                 ,              # With the length of 
                                           ~$0     # The num if it exists
                                  $l [R||]=        # Otherwise the previous num

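The $l [R||]=~$/ part roughly translates to $l= ~$/||+$l but... it has the same number of bytes :(. Originally it saved bytes by using an anonymous variable, so the my$l was gone, but that doesn't work because the scope is now the substitution rather than the map code block. Oh well. Anyway, R is the reverse metaoperator, so it reverses the arguments of ||, and the $l variable ends up being assigned the new number (~$/) if it exists, otherwise itself again.

It could be 47 bytes if Perl 6 didn't throw a somewhat redundant compiler error for =~.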


5

Ruby, 49 45 43 bytes

$0=$_=$0[/.{0#{p=$_[/\d+/]||p}}/]+$_[/\D+/]

Try it online!

Explanation

$0=                                         #Previous word, assign the value of
   $_=                                      #Current word, assign the value of
      $0[/.{0#{              }}/]           #Starting substring of $0 of length p which is
               p=$_[/\d+/]||p               #defined as a number in the start of $_ if any 
                                 +$_[/\D+/] #Plus any remaining non-digits in $_
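The idea of building the prefix pattern at run time can be sketched in Python as well. This is a loose illustration, not a translation of the Ruby answer: it uses an integer count and the more lenient quantifier `.{0,n}` instead of interpolating the digits after a literal 0, and the name `unpack_with_quantifier` is invented here.

```python
import re


def unpack_with_quantifier(lines):
    prev, n, out = "", 0, []
    for line in lines:
        digits = re.match(r"[0-9]*", line).group()   # leading digits, possibly ""
        letters = line[len(digits):]
        if digits:
            n = int(digits)                          # otherwise keep the old count
        # Build the pattern on the fly: up to n leading characters of prev.
        prefix = re.match(r".{0,%d}" % n, prev).group()
        prev = prefix + letters
        out.append(prev)
    return out


print(unpack_with_quantifier(["aa", "2h", "3ed", "ing", "s"]))
```

Because the quantifier is greedy, the match takes exactly n characters whenever the previous word is at least that long, which the challenge guarantees.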

5

C, 65 57 bytes

n;f(){char c[99];while(scanf("%d",&n),gets(c+n))puts(c);}

Try it online!

Explanation:

n;                     /* n is implicitly int, and initialized to zero. */

f() {                  /* the unpacking function. */

    char c[99];        /* we need a buffer to read into, for the longest line in
                          the full dictionary we need 12 + 1 bytes. */

    while(             /* loop while there is input left. */

        scanf("%d",&n) /* Read into n, if the read fails because this line
                          doesn't have a number n's value does not change.
                          scanf's return value is ignored. */

        ,              /* chain expressions with the comma operator. The loop
                          condition is on the right side of the comma. */

        gets(c+n))     /* we read into c starting from cₙ. c₀, c₁.. up to cₙ is
                          the shared prefix of the word we are reading and the
                          previous word. When gets is successful it returns c+n
                          else it will return NULL. When the loop condition is
                          NULL the loop exits. */

        puts(c);}      /* print the unpacked word. */
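The in-place trick (reading the new letters straight into the old buffer at offset n) can be mimicked in Python with a bytearray. This is only a sketch of the same idea; `unpack_in_place` is a made-up name, and the slice assignment stands in for what gets(c+n) plus the terminating NUL achieve in C.

```python
def unpack_in_place(lines):
    buf = bytearray()   # plays the role of the C buffer c
    n = 0               # like the global n, it survives between lines
    out = []
    for line in lines:
        raw = line.encode("ascii")
        i = 0
        while i < len(raw) and 48 <= raw[i] <= 57:   # count leading ASCII digits
            i += 1
        if i:                      # corresponds to scanf("%d") succeeding
            n = int(raw[:i])
        buf[n:] = raw[i:]          # overwrite everything from offset n onwards
        out.append(buf.decode("ascii"))
    return out


print(unpack_in_place(["aa", "2rdvark", "8s", "4wolf", "7ves"]))
```

The shared prefix is never copied: it simply stays where it was, and only the tail of the buffer is replaced.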

5

brainfuck, 201 bytes

,[[[-<+>>>+<<]>-[---<+>]<[[-<]>>]<[-]>>[<<,>>>[-[-<++++++++++>]]++++<[->+<]-[----->-<]<]<]>>>[[>>]+[-<<]>>[[>>]+[<<]>>-]]+[>>]<[-]<[<<]>[->[>>]<+<[<<]>]>[>.>]+[>[-]<,.[->+>+<<]>>----------]<[<<]>-<<<,]

Try it online!

Requires a trailing newline at the end of the input. A version without this requirement is 6 bytes longer:

brainfuck, 207 bytes

,[[[-<+>>>+<<]>-[---<+>]<[[-<]>>]<[-]>>[<<,>>>[-[-<++++++++++>]]++++<[->+<]-[----->-<]<]<]>>>[[>>]+[-<<]>>[[>>]+[<<]>>-]]+[>>]<[-]<[<<]>[->[>>]<+<[<<]>]>[>.>]+[>[-]<,[->+>+<<]>>[----------<.<]>>]<[<<]>-<<<,]

Try it online!

Both versions assume all numbers are strictly less than 255.

Explanation

The tape is laid out as follows:

tempinputcopy 85 0 inputcopy number 1 a 1 a 1 r 1 d 0 w 0 o 0 l 0 f 0 ...

The "number" cell is equal to 0 if no digits are input, and n+1 if the number n is input. Input is taken at the cell marked "85".

,[                     take input and start main loop
 [                     start number input loop
  [-<+>>>+<<]          copy input to tempinputcopy and inputcopy
  >-[---<+>]           put the number 85 in the cell where input was taken
  <[[-<]>>]            test whether input is less than 85; ending position depends on result of comparison
                       (note that digits are 48 through 57 while letters are 97 through 122)
  <[-]>                clean up by zeroing out the cell that didn't already become zero
  >[                   if input was a digit:
   <<,>>               get next input character
   >[-[-<++++++++++>]] multiply current value by 10 and add to current input
   ++++                set number cell to 4 (as part of subtracting 47)
   <[->+<]             add input plus 10*number back to number cell
   -[----->-<]         subtract 51
  <]                   move to cell we would be at if input were a letter
 <]                    move to input cell; this is occupied iff input was a digit

                       part 2: update/output word

 >>>                   move to number cell
 [                     if occupied (number was input):
  [>>]+[-<<]>>         remove existing marker 1s and decrement number cell to true value
  [[>>]+[<<]>>-]       create the correct amount of marker 1s
 ]
 +[>>]<[-]             zero out cell containing next letter from previous word
 <[<<]>                return to inputcopy
 [->[>>]<+<[<<]>]      move input copy to next letter cell
 >[>.>]                output word so far
 +[                    do until newline is read:
  >[-]<                zero out letter cell
  ,.                   input and output next letter or newline
  [->+>+<<]            copy to letter cell and following cell
  >>----------         subtract 10 to compare to newline
 ]
 <[<<]>-               zero out number cell (which was 1 to make copy loop shorter)
 <<<,                  return to input cell and take input
]                      repeat until end of input
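The n+1 convention for the "number" cell can be illustrated with a small Python sketch. It models only the arithmetic described in the annotations (strip the offset, multiply by 10, add the digit, restore the offset), not the tape movements, and the function names are invented for this example.

```python
def read_number_cell(line):
    """Mimic the 'number' cell: 0 if the line has no digits, else n + 1."""
    cell = 0
    i = 0
    # Digits are 48..57 and letters 97..122, so "< 85" separates them.
    while i < len(line) and ord(line[i]) < 85:
        if cell:
            cell -= 1                          # strip the +1 offset before scaling
        cell = cell * 10 + ord(line[i]) - 47   # -48 for ASCII, +1 for the offset
        i += 1
    return cell, line[i:]


def unpack_bf_style(lines):
    out, prev, count = [], "", 0
    for line in lines:
        cell, letters = read_number_cell(line)
        if cell:                  # occupied cell: a number was given, possibly 0
            count = cell - 1
        prev = prev[:count] + letters
        out.append(prev)
    return out


print(read_number_cell("ing"), read_number_cell("0jkl"), read_number_cell("10l"))
```

Storing n+1 rather than n is what lets an explicit 0 be told apart from a line with no number at all.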

4

Python 3.6+, 172 195 156 123 122 121 104 bytes

import re
def f(l,n=0,w=""):
 for s in l:t=re.match("\d*",s)[0];n=int(t or n);w=w[:n]+s[len(t):];yield w

Try it online!

Explanation

I caved and used regular expressions. This saved at least 17 bytes:

t=re.match("\d*",s)[0]

If the string doesn't begin with a digit at all, the length of this match will be 0. This means that:

n=int(t or n)

will be n if t is empty, and int(t) otherwise.

w=w[:n]+s[len(t):]

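removes the number that the regular expression found from s (if no number was found, it removes 0 characters, leaving s untruncated) and replaces all but the first n characters of the previous word with the current word fragment; and: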

yield w

outputs the current word.


4

Haskell, 82 81 bytes

tail.map concat.scanl p["",""]
p[n,l]a|[(i,r)]<-reads a=[take i$n++l,r]|1<2=[n,a]

Takes and returns a list of strings.

Try it online!

        scanl p["",""]        -- fold function 'p' into the input list starting with
                              -- a list of two empty strings and collect the
                              -- intermediate results in a list
  p [n,l] a                   -- 1st string of the list 'n' is the part taken from the last word
                              -- 2nd string of the list 'l' is the part from the current line
                              -- 'a' is the code from the next line
     |[(i,r)]<-reads a        -- if 'a' can be parsed as an integer 'i' and a string 'r'
       =[take i$n++l,r]       -- go on with the first 'i' chars from the last line (-> 'n' and 'l' concatenated) and the new ending 'r'
     |1<2                     -- if parsing is not possible
       =[n,a]                 -- go on with the previous beginning of the word 'n' and the new end 'a'
                              -- e.g. [         "aa",     "2h",      "3ed",       "ing"       ] 
                              -- ->   [["",""],["","aa"],["aa","h"],["aah","ed"],["aah","ing"]]
  map concat                  -- concatenate each sublist
tail                          -- drop first element. 'scanl' saves the initial value in the list of intermediate results. 
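The same scan over (kept prefix, new ending) pairs can be sketched in Python, with itertools.accumulate standing in for scanl. This assumes Python 3.8 or later for the `initial` argument; `step` and `unpack_scan` are names invented for the illustration.

```python
from itertools import accumulate


def step(state, line):
    """One scan step: state is (part kept from the last word, part from this line)."""
    kept, tail = state
    letters = line.lstrip("0123456789")
    digits = line[: len(line) - len(letters)]
    if digits:
        # A number was parsed: keep that many characters of the whole last word.
        return ((kept + tail)[: int(digits)], letters)
    # No number: reuse the previous beginning with the new ending.
    return (kept, letters)


def unpack_scan(lines):
    states = accumulate(lines, step, initial=("", ""))
    # Drop the initial state, then join each pair into a word.
    return [kept + tail for kept, tail in states][1:]


print(unpack_scan(["aa", "2h", "3ed", "ing"]))
```

The intermediate states for this input are ("", "aa"), ("aa", "h"), ("aah", "ed") and ("aah", "ing"), matching the worked example in the comments above.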

Edit: -1 byte thanks to @Nitrodon.


1
Contrary to the usual Haskell golfing advice, you can actually save one byte here by not defining the helper function as an infix operator.
Nitrodon

@Nitrodon: Well spotted! Thanks!
nimi

3

Japt, 19 18 17 bytes

Initially inspired by Arnauld's JS solution.

;£=¯V=XkB ªV +XoB

Try it

                      :Implicit input of string array U
 £                    :Map each X
   ¯                  :  Slice U to index
      Xk              :    Remove from X
;       B             :     The lowercase alphabet (leaving only the digits or an empty string, which is falsey)
          ªV          :    Logical OR with V (initially 0)
    V=                :    Assign the result to V for the next iteration
             +        :  Append
              Xo      :  Remove everything from X, except
;               B     :   The lowercase alphabet
  =                   :  Reassign the resulting string to U for the next iteration

2

Jelly, 16 bytes

⁹fØDVo©®⁸ḣ;ḟØDµ\

Try it online!

How it works

⁹fØDVo©®⁸ḣ;ḟØDµ\  Main link. Argument: A (array of strings)

              µ\  Cumulatively reduce A by the link to the left.
⁹                     Yield the right argument.
  ØD                  Yield "0123456789".
 f                    Filter; keep only digits.
    V                 Eval the result. An empty string yields 0.
     o©               Perform logical OR and copy the result to the register.
       ®              Yield the value in the register (initially 0).
        ⁸ḣ            Head; keep that many characters of the left argument.
          ;           Concatenate the result and the right argument.
            ØD        Yield "0123456789".
           ḟ          Filterfalse; keep only non-digits.


1

Retina 0.8.2, 69 bytes

+`((\d+).*¶)(\D)
$1$2$3
\d+
$*
+m`^((.)*(.).*¶(?<-2>.)*)(?(2)$)1
$1$3

Try it online! Link includes harder test cases. Explanation:

+`((\d+).*¶)(\D)
$1$2$3

For all lines that begin with a letter, copy the number from the previous line, looping until all lines begin with a number.

\d+
$*

Convert the number to unary.

+m`^((.)*(.).*¶(?<-2>.)*)(?(2)$)1
$1$3

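Use balancing groups to replace all 1s with the corresponding letter from the previous line. (This turns out to be slightly golfier than replacing all runs of 1s.)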
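The staged structure (first make every count explicit, then expand prefixes) can be sketched in Python as two separate passes. This is a simplification under stated assumptions: the unary conversion is skipped, lines before the first number are given a count of 0, and `fill_counts` and `expand` are hypothetical names.

```python
def fill_counts(lines):
    """Stage 1: attach an explicit count to every line, inherited from above."""
    filled, last = [], 0
    for line in lines:
        letters = line.lstrip("0123456789")
        digits = line[: len(line) - len(letters)]
        if digits:
            last = int(digits)
        filled.append((last, letters))
    return filled


def expand(filled):
    """Stage 2: replace each count with that many letters of the previous word."""
    out, prev = [], ""
    for count, letters in filled:
        prev = prev[:count] + letters
        out.append(prev)
    return out


stage1 = fill_counts(["aa", "2h", "3ed", "ing", "s"])
print(stage1)
print(expand(stage1))
```

Separating the passes means the second stage never has to remember a count from an earlier line.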




1

Groovy, 74 bytes

{w="";d=0;it.replaceAll(/(\d*)(.+)/){d=(it[1]?:d)as int;w=w[0..<d]+it[2]}}

Try it online!

Explanation:

{                                                                        }  Closure, sole argument = it
 w="";d=0;                                                                  Initialize variables
          it.replaceAll(/(\d*)(.+)/){                                   }   Replace every line (since this matches every line) and implicitly return. Loop variable is again it
                                     d=(it[1]?:d)as int;                    If a number is matched, set d to the number as an integer, else keep the value
                                                        w=w[0..<d]+it[2]    Set w to the first d characters of w, plus the matched string


0

Perl 5 -p, 45 41 bytes

s:\d*:substr($p,0,$l=$&+$l*/^\D/):e;$p=$_

Try it online!

Explanation:

s:\d*:substr($p,0,$l=$&+$l*/^\D/):e;$p=$_ Full program, implicit input
s:   :                           :e;      Replace
  \d*                                       Any number of digits
      substr($p,0,              )           By a prefix of $p (previous result or "")
                  $l=  +                      With a length (assigned to $l) of the sum
                     $&                         of the matched digits
                          *                     and the product
                        $l                        of $l (previous length or 0)
                           /^\D/                  and whether there is no number in the beginning (1 or 0)
                                                (product is $l if no number)
                                    $p=$_ Assign output to $p
                                          Implicit output
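The branch-free length update (matched digits plus the old length times a 0-or-1 flag) can be written out in Python as a hedged sketch. Perl treats an empty match as 0 in numeric context; here that is spelled `int(digits or 0)`. The names `next_length` and `unpack_arith` are invented for the example.

```python
def next_length(line, prev_len):
    letters = line.lstrip("0123456789")
    digits = line[: len(line) - len(letters)]
    starts_with_letter = int(not digits)   # plays the role of /^\D/ (1 or 0)
    # Either the first term or the second term is zero, never both non-zero.
    return int(digits or 0) + prev_len * starts_with_letter


def unpack_arith(lines):
    out, prev, length = [], "", 0
    for line in lines:
        length = next_length(line, length)
        prev = prev[:length] + line.lstrip("0123456789")
        out.append(prev)
    return out


print(unpack_arith(["aa", "2h", "3ed", "ing", "s"]))
```

When a number is present the flag is 0 and the old length drops out; when it is absent the digit term is 0 and the old length is carried over.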


0

05AB1E, 20 19 17 bytes

õUvyþDõÊi£U}Xyá«=

Try it online or verify all test cases.

Explanation:

õ                  # Push an empty string ""
 U                 # Pop and store it in variable `X`
v                  # Loop `y` over the (implicit) input-list
 yþ                #  Push `y`, and leave only the digits (let's call it `n`)
   DõÊi  }         #  If it's NOT equal to an empty string "":
       £           #   Pop and push the first `n` characters of the string
        U          #   Pop and store it in variable `X`
          X        #  Push variable `X`
           yá      #  Push `y`, and leave only the letters
             «     #  Merge them together
              =    #  Print it (without popping)

0

Common Lisp, 181 bytes

(do(w(p 0))((not(setf g(read-line t()))))(multiple-value-bind(a b)(parse-integer g :junk-allowed t)(setf p(or a p)w(concatenate'string(subseq w 0 p)(subseq g b)))(format t"~a~%"w)))

Try it online!

Ungolfed:

(do (w (p 0))   ; w previous word, p previous integer prefix (initialized to 0)
    ((not (setf g (read-line t ()))))   ; read a line into new variable g
                                        ; and if null terminate: 
  (multiple-value-bind (a b)            ; let a, b the current integer prefix
      (parse-integer g :junk-allowed t) ; and the position after the prefix
    (setf p (or a p)                    ; set p to a, or keep p if a is nil (no numeric prefix)
          w (concatenate 'string        ; set w to the concatenation of prefix
             (subseq w 0 p)             ; characters from the previous word 
             (subseq g b)))             ; and the rest of the current line
    (format t"~a~%"w)))                 ; print the current word

As usual, the long identifiers of Common Lisp make it not particularly suitable for PPCG.



0

C# (Visual C# Interactive Compiler), 134 bytes

a=>{int l=0,m,n;var p="";return a.Select(s=>{for(m=n=0;s[m]<58;n=n*10+s[m++]-48);return p=p.Substring(0,l=m>0?n:l)+s.Substring(m);});}

Try it online!

-9 bytes thanks to @ASCIIOnly!

Less golfed...

// a is an input list of strings
a=>{
  // l: last prefix length
  // m: current number of digits
  // n: current prefix length
  int l=0,m,n;
  // previous word
  var p="";
  // run a LINQ select against the input
  // s is the current word
  return a.Select(s=>{
    // nibble digits from start of the
    // current word to build up the
    // current prefix length
    for(m=n=0;
      s[m]<58;
      n=n*10+s[m++]-48);
    // append the prefix from the
    // previous word to the current
    // word and capture values
    // for the next iteration
    return
      p=p.Substring(0,l=m>0?n:l)+
      s.Substring(m);
  });
}


That's pretty cool :) I changed l=n>0?n:l to l=m>0?n:l because it wasn't picking up the case where a line starts with zero (0jkl). Thanks for the tip!
dana
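The difference between testing n>0 and m>0 can be seen in a short Python sketch of the same digit-nibbling loop. The function names and the `check_digit_count` flag are invented for this illustration, and the explicit bounds check is an addition (the C# loop can rely on every line containing a letter).

```python
def parse_prefix(line):
    """Return (m, n): m = number of leading digits, n = their numeric value."""
    m = n = 0
    while m < len(line) and ord(line[m]) < 58:   # '9' is 57, letters start at 97
        n = n * 10 + ord(line[m]) - 48
        m += 1
    return m, n


def unpack_cs(lines, check_digit_count=True):
    out, prev, last = [], "", 0
    for line in lines:
        m, n = parse_prefix(line)
        # m > 0 means "digits were present"; n > 0 misses an explicit 0.
        has_number = m > 0 if check_digit_count else n > 0
        if has_number:
            last = n
        prev = prev[:last] + line[m:]
        out.append(prev)
    return out


case = ["hi", "1x", "0jkl"]
print(unpack_cs(case))
print(unpack_cs(case, check_digit_count=False))
```

With the m>0 test the last word comes out as jkl; with the n>0 test the explicit 0 is ignored, the previous count of 1 is reused, and the result is hjkl.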

Licensed under cc by-sa 3.0 with attribution required.