Fully justify and hyphenate


26
Given  a width  and  a block  of
text containing possible hyphen-
ation points,  format it  fully-
justified (in monospace).

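Fully justified means the text is aligned on both the left and the right, which is achieved by increasing the spacing between words until each line fits.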

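Input

You can take input in any format you like. You will be given:

  • A target width (in characters), in the range 5-100 (inclusive);
  • A block of text containing possibly hyphenated words. This could be a space-separated string, an array of words, or an array of arrays of word fragments (or any other data representation you like).

A typical input might be: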

Width: 25
Text:  There's no bu-si-ne-ss lik-e s-h-o-w busine-ss, n-o bus-iness I know.

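The hyphens denote possible hyphenation points, and the spaces denote word boundaries. A possible alternative representation of the text: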

[["There's"], ["no"], ["bu", "si", "ne", "ss"], ["lik", "e"], (etc.)]
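As a minimal sketch (Python 3, with the hypothetical helper name `parse_fragments`), the space-separated form can be converted into the nested-array form like this, assuming hyphens only ever mark hyphenation points:

```python
def parse_fragments(text):
    """Split a space-separated string into words, and each word into
    fragments at its candidate hyphenation points."""
    return [word.split("-") for word in text.split()]


words = parse_fragments("There's no bu-si-ne-ss lik-e s-h-o-w busine-ss, n-o bus-iness I know.")
print(words[:4])
```

Joining a word's fragments with an empty string gives the unhyphenated word, so no information is lost by this representation.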

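Output

The input text with spaces added between words, newlines at the column width, and hyphenation points chosen so that it is fully justified to the column width. Functions may return an array of strings (one per line) instead of using newline separation.

A possible output for the above input might be: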

There's no  business like
show  business,  no  bus-
iness I know.

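Note that all hyphens have been removed except the one in the final "bus-iness", which is kept to show that the word wraps onto the next line, and which was chosen so that the second line contains as much text as possible.

Rules

  • Within each line, the number of spaces between words cannot vary by more than 1, but where you insert the extra spaces is otherwise up to you: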

    hello hi foo     bar    <-- not permitted (1,1,5)
    hello  hi foo    bar    <-- not permitted (2,1,4)
    hello  hi  foo   bar    <-- OK (2,2,3)
    hello  hi   foo  bar    <-- OK (2,3,2)
    hello   hi  foo  bar    <-- OK (3,2,2)
    
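  • No line may begin or end with spaces (except the last line, which may end with spaces).

  • The last line should be left-justified, with single spaces between words. It may be followed by arbitrary whitespace or a newline if you wish, but this is not required.

  • Words will consist of A-Z, a-z, 0-9 and simple punctuation (.,'()&).

  • You may assume that no word fragment is longer than the target width, and that it will always be possible to fill lines according to the rules (i.e. each line will contain at least 2 word fragments, or 1 word fragment that fills the line exactly).

  • You must choose hyphenation points that maximise the number of word characters on earlier lines (i.e. words must be consumed greedily by lines), for example: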

    This is an input stri-ng with hyph-en-at-ion poi-nts.
    
    This     is     an     input    stri-      <-- not permitted
    ng with hyphenation points.
    
    This  is an  input string  with hyph-      <-- not permitted
    enation points.
    
    This is an input  string with hyphen-      <-- OK
    ation points.
    
  • The shortest code in bytes wins.
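The spacing rules above can be checked mechanically. The following is a rough sketch of such a checker in Python 3; the names `gap_lengths` and `check_layout` are made up for illustration, it expects at least one line, and it deliberately does not verify the greedy-hyphenation rule:

```python
import re


def gap_lengths(line):
    """Lengths of the runs of spaces between the words of one line."""
    return [len(run) for run in re.findall(r" +", line.strip(" "))]


def check_layout(lines, width):
    """True if `lines` obey the spacing rules (greediness is NOT checked)."""
    *body, last = lines
    for line in body:
        # Every line but the last must fill the width with no outer spaces.
        if len(line) != width or line != line.strip(" "):
            return False
        gaps = gap_lengths(line)
        # Gap sizes within one line may differ by at most 1.
        if gaps and max(gaps) - min(gaps) > 1:
            return False
    # The last line: left-justified, single spaces, trailing whitespace allowed.
    last = last.rstrip(" \n")
    return (len(last) <= width
            and not last.startswith(" ")
            and all(g == 1 for g in gap_lengths(last)))


print(check_layout(["hello  hi  foo   bar", "end."], 20))
print(check_layout(["hello hi foo     bar", "end."], 20))
```

The two calls use the (2,2,3) and (1,1,5) lines from the rule above, which are both 20 characters wide.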

Examples

Width: 20
Text:  The q-uick brown fox ju-mp-s ove-r t-h-e lazy dog.

The quick  brown fox
jumps over the  lazy
dog.

Width: 32
Text: Given a width and a block of text cont-ain-ing pos-sible hyphen-ation points, for-mat it ful-ly-just-ified (in mono-space).

Given  a width  and  a block  of
text containing possible hyphen-
ation points,  format it  fully-
justified (in monospace).

Width: 80
Text:  Pro-gram-ming Puz-zles & Code Golf is a ques-tion and ans-wer site for pro-gram-ming puz-zle enth-usi-asts and code golf-ers. It's built and run by you as part of the St-ack Exch-ange net-work of Q&A sites. With your help, we're work-ing to-g-et-her to build a lib-rary of pro-gram-ming puz-zles and their sol-ut-ions.

Programming Puzzles &  Code Golf  is a question and answer  site for programming
puzzle enthusiasts  and code golfers.  It's built and run  by you as part of the
Stack Exchange network  of Q&A sites. With your help,  we're working together to
build a library of programming puzzles and their solutions.

Width: 20
Text:  Pro-gram-ming Puz-zles & Code Golf is a ques-tion and ans-wer site for pro-gram-ming puz-zle enth-usi-asts and code golf-ers. It's built and run by you as part of the St-ack Exch-ange net-work of Q&A sites. With your help, we're work-ing to-g-et-her to build a lib-rary of pro-gram-ming puz-zles and their sol-ut-ions.

Programming  Puzzles
&  Code  Golf  is  a
question and  answer
site for programming
puzzle   enthusiasts
and  code   golfers.
It's  built  and run
by  you  as  part of
the  Stack  Exchange
network    of    Q&A
sites.   With   your
help,  we're working
together to  build a
library of  program-
ming   puzzles   and
their solutions.

Width: 5
Text:  a b c d e f g h i j k l mm nn oo p-p qq rr ss t u vv ww x yy z

a b c
d e f
g h i
j k l
mm nn
oo pp
qq rr
ss  t
u  vv
ww  x
yy z

Width: 10
Text:  It's the bl-ack be-ast of Araghhhhh-hhh-h-hhh-h-h-h-hh!

It's   the
black  be-
ast     of
Araghhhhh-
hhhhhhhhh-
hhh!

Yay, finally another (text-based) typography challenge :-)
ETHproductions

1
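@Adám Yes to builtins: there are no code restrictions, and the shortest code wins. Of course, that may make for a boring answer! As for libraries, you can use them as long as the library is freely available and you mark your answer as "language + library". The library version must also pre-date this challenge.
Dave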

1
If a line can end with either a hyphen or a single character (e.g. anybod-y with width 7), may we choose to output either anybody or anybod-\ny?
darrylyeo

1
@JonathanAllan Yes; sorry, I'll fix that.
Dave

3
@darrylyeo No, in that case you have to output the full word, since each line must greedily contain as many word characters as possible.
Dave

Answers:


7

JavaScript (ES6), 218 bytes

w=>s=>s.map((c,i)=>c.map((p,j)=>(k+p)[l="length"]-w-(b=!i|j>0)+(j<c[l]-1)<0?k+=b?p:" "+p:(Array(w-k[l]-b).fill(h=k.split` `).map((_,i)=>h[i%(h[l]-1)]+=" "),o.push(h.join` `+(b?"-":"")),k=p)),o=[],k="")&&o.join`
`+`
`+k

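Takes arguments in currying syntax (f(width)(text)), and the text input is in the double-array format described in the challenge. Strings are converted to that format via .split` `.map(a=>a.split`-`)). Also, the newlines are literal newlines inside template strings.

Un-golfed and rearranged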

width=>string=> {
    out=[];
    line="";
    string.map((word,i)=> {
        word.map((part,j)=> {

            noSpaceBefore = i==0 || j>0;
            if ((line+part).length - width - noSpaceBefore + (j<word.length-1) < 0) {
                line += noSpaceBefore ? part : " "+part;
            }
            else {
                words=line.split` `;
                Array(width - line.length - noSpaceBefore).fill()
                    .map((_,i) => words[i % (words.length-1)] += " ");
                out.push(words.join(" ") + (noSpaceBefore? "-" : ""));
                line=part;
            }
        });
    });
    return out.join("\n") + "\n"+line
}

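The idea here is to step through the parts of the entire string one at a time, building up each line part by part. Once a line is complete, the word spacing is increased from left to right until all the extra spaces have been placed.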
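For readers more comfortable with Python, here is a rough sketch of the same idea in Python 3. It is an illustrative port rather than the answer's actual code; the names `pad_line` and `justify` are hypothetical, and it assumes the input is already in the nested-array format and that the challenge's guarantees hold:

```python
def pad_line(line, width):
    """Add spaces to the gaps of `line`, left to right, until it is `width` long."""
    words = line.split(" ")
    gaps = len(words) - 1
    for k in range(width - len(line) if gaps else 0):
        words[k % gaps] += " "
    return " ".join(words)


def justify(width, words):
    """`words` is a list of words, each given as a list of fragments."""
    out, line = [], ""
    for i, word in enumerate(words):
        for j, part in enumerate(word):
            joined = i == 0 or j > 0      # no space goes before this part
            more = j < len(word) - 1      # a hyphen may have to follow it
            candidate = line + ("" if joined else " ") + part
            if len(candidate) + more <= width:
                line = candidate
            else:
                # The part does not fit: finish the line, hyphenated if mid-word.
                out.append(pad_line(line + ("-" if joined else ""), width))
                line = part
    return out + [line]


text = "It's the bl-ack be-ast of Araghhhhh-hhh-h-hhh-h-h-h-hh!"
print("\n".join(justify(10, [w.split("-") for w in text.split()])))
```

Reserving one character for a possible hyphen (`more`) before accepting a non-final fragment is what keeps the line from overflowing when the word has to be broken there.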

Test snippet

f=
w=>s=>s.map((c,i)=>c.map((p,j)=>(k+p)[l="length"]-w-(b=!i|j>0)+(j<c[l]-1)<0?k+=b?p:" "+p:(Array(w-k[l]-b).fill(h=k.split` `).map((_,i)=>h[i%(h[l]-1)]+=" "),o.push(h.join` `+(b?"-":"")),k=p)),o=[],k="")&&o.join`
`+`
`+k
<style>*{font-family:Consolas,monospace;}</style>
<div oninput="O.innerHTML=f(+W.value)(S.value.split` `.map(a=>a.split`-`))">
Width: <input type="number" size="3" min="5" max="100" id="W">
Tests: <select id="T" style="width:20em" oninput="let x=T.value.indexOf(','),s=T.value;W.value=s.slice(0,x);S.value=s.slice(x+2)"><option></option><option>20, The q-uick brown fox ju-mp-s ove-r t-h-e lazy dog.</option><option>32, Given a width and a block of text cont-ain-ing pos-sible hyphen-ation points, for-mat it ful-ly-just-ified (in mono-space).</option><option>80, Pro-gram-ming Puz-zles & Code Golf is a ques-tion and ans-wer site for pro-gram-ming puz-zle enth-usi-asts and code golf-ers. It's built and run by you as part of the St-ack Exch-ange net-work of Q&A sites. With your help, we're work-ing to-g-et-her to build a lib-rary of pro-gram-ming puz-zles and their sol-ut-ions.</option><option>20, Pro-gram-ming Puz-zles & Code Golf is a ques-tion and ans-wer site for pro-gram-ming puz-zle enth-usi-asts and code golf-ers. It's built and run by you as part of the St-ack Exch-ange net-work of Q&A sites. With your help, we're work-ing to-g-et-her to build a lib-rary of pro-gram-ming puz-zles and their sol-ut-ions.</option><option>5, a b c d e f g h i j k l mm nn oo p-p qq rr ss t u vv ww x yy z</option><option>10, It's the bl-ack be-ast of Araghhhhh-hhh-h-hhh-h-h-h-hh</option></select><br>
Text: &nbsp;<textarea id="S" cols="55" rows="4"></textarea>
</div>
<pre id="O" style="border: 1px solid black;display:inline-block;"></pre>


8

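GNU sed -r, 621 bytes

Takes input as two lines: first the width as a unary number, then the string.

I'm certain this could be golfed a lot more, but I've already sunk too much time into it.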

x;N
G
s/\n/!@/
:
/@\n/bZ
s/-!(.*)@ /\1 !@/
s/!(.*[- ])(@.*1)$/\1!\2/
s/@(.)(.*)1$/\1@\2/
s/-!(.*-)(@.*)\n$/\1!\2\n1/
s/(\n!@) /\1/
s/-!(.* )(@.*)\n$/\1!\2\n1/
s/-!(.*-)(@.*1)$/\1!\21/
s/!(.*)-@([^ ]) /\1\2!@ /
t
s/ !@(.*)\n$/\n!@\1#/
s/!(.*-)@(.*)\n$/\1\n!@\2#/
s/!(.*)(@ | @)(.*)\n$/\1\n!@\3#/
s/-!(.*[^-])@([^ ]) (.*)\n$/\1\2\n!@\3#/
s/!(.+)@([^ ].*)\n$/\n!@\1\2#/
/#|!@.*\n$/{s/#|\n$//;G;b}
:Z
s/-?!|@.*//g
s/ \n/\n/g
s/^/%/
:B
G
/%.*\n.+\n/!bQ
:C
s/%([^\n])(.*)1$/\1%\2/
tC
s/([^\n]+)%\n/%\1\n/
:D
s/%([^ \n]* )(.*)1$/\1 %\2/
tD
s/(^|\n)([^\n]+)%(.*1)$/\1%\2\3/
tD
s/%([^\n]*)\n(.*)\n$/\1\n%\2/
tB
:Q
s/%(.*)\n1*$/\1/

Try it online!

Explanation

The program works in two phases: 1. Split and 2. Justify. For what follows, suppose our input is:

111111111111
I re-mem-ber a time of cha-os, ru-ined dreams, this was-ted land.

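Setup

First we read the input, moving the first line (the width as a unary number) to the hold space (x), then appending the next line (N), and then appending a copy of the width from the hold space (G) to the pattern space. Since N left us with a leading \n, we replace it with !@, which we'll use as cursors in Phase 1.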

x;N
G
s/\n/!@/

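Now the content of the hold space is 111111111111 (and it won't change from here on), and the pattern space is (in the format of sed's "print unambiguously" l command):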

!@I re-mem-ber a time of cha-os, ru-ined dreams, this was-ted land.\n111111111111$

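Phase 1

In Phase 1, the main @ cursor advances one character at a time, and for each character a 1 is removed from the "counter" at the end of the pattern space. In other words, @foo\n111$, f@oo\n11$, fo@o\n1$, etc.

The ! cursor trails behind the @ cursor, marking the places where we could break if the counter reaches 0 in the middle of a line. A couple of rounds look like this: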

!@I re-mem-ber a time of cha-os, ru-ined dreams, this was-ted land.\n111111111111$
!I@ re-mem-ber a time of cha-os, ru-ined dreams, this was-ted land.\n11111111111$
!I @re-mem-ber a time of cha-os, ru-ined dreams, this was-ted land.\n1111111111$

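Here there's a pattern we recognize: a space immediately followed by the @ cursor. Since the counter is greater than 0, we advance the break marker, then keep advancing the main cursor: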

I !@re-mem-ber a time of cha-os, ru-ined dreams, this was-ted land.\n1111111111$
I !r@e-mem-ber a time of cha-os, ru-ined dreams, this was-ted land.\n111111111$
I !re@-mem-ber a time of cha-os, ru-ined dreams, this was-ted land.\n11111111$
I !re-@mem-ber a time of cha-os, ru-ined dreams, this was-ted land.\n1111111$

Here's another pattern: -@, and we still have 7 in the counter, so we advance the break cursor again and keep going:

I re-!mem-@ber a time of cha-os, ru-ined dreams, this was-ted land.\n111$

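Here's a different pattern: a hyphen immediately before the break cursor and another immediately before the main cursor. We remove the first hyphen, advance the break cursor, and, since we removed a character, add 1 to the counter.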

I remem-!@ber a time of cha-os, ru-ined dreams, this was-ted land.\n1111$

We keep advancing the main cursor:

I remem-!ber@ a time of cha-os, ru-ined dreams, this was-ted land.\n1$

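This is similar to before, but this time the main cursor precedes a space rather than following a hyphen. We remove the hyphen, but since we are also advancing the main cursor, we neither increment nor decrement the counter.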

I remember !@a time of cha-os, ru-ined dreams, this was-ted land.\n1$
I remember !a@ time of cha-os, ru-ined dreams, this was-ted land.\n$

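Finally, our counter has reached zero. Since the character after the main cursor is a space, we insert a newline and put both cursors immediately after it. Then we replenish the counter (G) and start again.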

I remember a\n!@ time of cha-os, ru-ined dreams, this was-ted land.\n111111111111$

Phase 1 continues, advancing the cursors and matching various patterns, until the @ cursor reaches the end of the string.

# Phase 1
:
  # End of string; branch to :Z (end of phase 1)
  /@\n/bZ

  # Match -!.*@_
  s/-!(.*)@ /\1 !@/

  # Match [-_]@ and >0
  s/!(.*[- ])(@.*1)$/\1!\2/

  # Advance cursor
  s/@(.)(.*)1$/\1@\2/

  # Match -!.*-@ and 0; add 1
  s/-!(.*-)(@.*)\n$/\1!\2\n1/

  # Match \n!@_
  s/(\n!@) /\1/

  # Match -!.*_@ and 0; add 1
  s/-!(.* )(@.*)\n$/\1!\2\n1/

  # Match -!.*-@ and >0; add 1
  s/-!(.*-)(@.*1)$/\1!\21/

  # Match -@[^_]_
  s/!(.*)-@([^ ]) /\1\2!@ /

  # If there were any matches, branch to `:`
  t

  # Match _!@ and 0
  s/ !@(.*)\n$/\n!@\1#/

  # Match -@ and 0
  s/!(.*-)@(.*)\n$/\1\n!@\2#/

  # Match @_|_@ and 0
  s/!(.*)(@ | @)(.*)\n$/\1\n!@\3#/

  # Match -!.*[^-]@[^_]_ and 0
  s/-!(.*[^-])@([^ ]) (.*)\n$/\1\2\n!@\3#/

  # Match !.+@[^_] and 0
  s/!(.+)@([^ ].*)\n$/\n!@\1\2#/

  # Match marked line (#) or !@ and 0
  /#|!@.*\n$/{
    # Remove mark; append width and branch to `:`
    s/#|\n$//
    G
    b
  }

:Z

# Cleanup
s/-?!|@.*//g
s/ \n/\n/g

At the end of Phase 1, our pattern space looks like this:

I remember a\ntime of cha-\nos, ruined\ndreams, this\nwasted land.

Or:

I remember a
time of cha-
os, ruined
dreams, this
wasted land.
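The cursor-and-counter idea of Phase 1 can be sketched outside sed. The Python 3 function below is only an illustration (the name `split_lines` and its variables are not from the sed program): `shown` plays the role of the text the @ cursor has passed, and `brk` remembers the last place the ! cursor marked. It assumes single spaces between words and that every line contains a usable break point, as the challenge guarantees:

```python
def split_lines(text, width):
    lines, start = [], 0
    while start < len(text):
        shown = ""   # visible characters the main cursor has passed on this line
        brk = None   # (line text, index where the next line starts) at the last break
        pos = start
        while pos < len(text):
            ch = text[pos]
            if ch == " ":
                brk = (shown, pos + 1)        # breaking here drops the space
                shown += " "
            elif ch == "-":
                if len(shown) < width:        # room left for a visible hyphen
                    brk = (shown + "-", pos + 1)
            elif len(shown) < width:
                shown += ch                   # counter is still above zero
            else:
                break                         # counter hit zero in mid-line
            pos += 1
        else:
            brk = (shown, len(text))          # ran out of text: this is the last line
        lines.append(brk[0])
        start = brk[1]
    return lines


print(split_lines("I re-mem-ber a time of cha-os, ru-ined dreams, this was-ted land.", 12))
```

Unused hyphens never enter `shown`, which mirrors how the sed program deletes a hyphen as soon as the break cursor moves past it.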

Phase 2

In Phase 2 we use % as a cursor and use the counter in a similar way, starting like this:

%I remember a\ntime of cha-\nos, ruined\ndreams, this\nwasted land.\n111111111111$

First, we count the characters on the first line by advancing the cursor and removing 1s from the counter, after which we have:

I remember a%\ntime of cha-\nos, ruined\ndreams, this\nwasted land.\n$

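Since the counter is 0, we don't do anything else on this line. The second line also has exactly as many characters as the counter, so let's skip to the third line: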

I remember a\ntime of cha-\nos, ruined%\ndreams, this\nwasted land.\n11$

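The counter is greater than 0, so we move the cursor back to the beginning of the line. Then we find the first space and add another space to it, decrementing the counter.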

I remember a\ntime of cha-\nos, % ruined\ndreams, this\nwasted land.\n1$

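The counter is still greater than 0; since the cursor has already passed the last (and only) gap on the line, we move it back to the beginning of the line and do it again: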

I remember a\ntime of cha-\nos,  % ruined\ndreams, this\nwasted land.\n$

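Now the counter is 0, so we move the cursor to the beginning of the next line. We repeat this for every line except the last. That's the end of Phase 2, and the end of the program! The final result is: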

I remember a
time of cha-
os,   ruined
dreams, this
wasted land.

# Phase 2
# Insert cursor
s/^/%/
:B
  # Append counter from hold space
  G
  # This is the last line; branch to :Q (end of phase 2)
  /%.*\n.+\n/!bQ

  :C
    # Count characters
    s/%([^\n])(.*)1$/\1%\2/
    tC

  # Move cursor to beginning of line
  s/([^\n]+)%\n/%\1\n/

  :D
    # Add one to each space on the line as long as counter is >0
    s/%([^ \n]* )(.*)1$/\1 %\2/
    tD

    # Counter is still >0; go back to beginning of line
    s/(^|\n)([^\n]+)%(.*1)$/\1%\2\3/
    tD

    # Counter is 0; move cursor to next line and branch to :B
    s/%([^\n]*)\n(.*)\n$/\1\n%\2/
    tB

:Q

# Remove cursor, any remaining 1s
s/%(.*)\n1*$/\1/
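Phase 2's result can also be computed without a cursor. This Python 3 sketch (the name `pad_lines` is an assumption, not part of the sed program) swaps the repeated left-to-right passes for a `divmod`, which should hand out the same number of spaces per gap as cycling through the gaps one space at a time:

```python
def pad_lines(lines, width):
    """Justify every line except the last to exactly `width` characters."""
    padded = []
    for line in lines[:-1]:
        words = line.split()
        gaps = len(words) - 1
        if gaps == 0:
            padded.append(line)          # a single fragment: nothing to stretch
            continue
        spaces = width - sum(len(w) for w in words)
        base, extra = divmod(spaces, gaps)
        # The leftmost `extra` gaps get one more space than the rest.
        sizes = [base + (k < extra) for k in range(gaps)] + [0]
        padded.append("".join(w + " " * s for w, s in zip(words, sizes)))
    return padded + lines[-1:]


phase1 = ["I remember a", "time of cha-", "os, ruined", "dreams, this", "wasted land."]
print("\n".join(pad_lines(phase1, 12)))
```

Because every gap receives either `base` or `base + 1` spaces, the "cannot vary by more than 1" rule holds by construction.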

This is incredible, but when I run it using gsed (GNU sed) 4.4 I get gsed: -e expression #1, char 16: ":" lacks a label. Can you add a note on exactly how you invoke it? (I'm using printf "%s\n%s" "$1" "$2" | gsed -r '<code here>';)
Dave

@Dave That works for me in GNU sed 4.2. Here's a gist: gist.github.com/jrunning/91a7584d95fe10ef6b036d1c82bd385c Note that TiO's sed page doesn't seem to respect the -r flag, which is why the TiO link above goes to the bash page.
Jordan

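Ah, I hadn't noticed the TiO link. That'll do for me; have a +1! There are 2 small mistakes in the last example (the "black beast" one), though: it prints the second-to-last line one character short, and it misses the final ! (though since I left ! out of the list of possible special characters, I won't hold that against it).
Dave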

5

JavaScript (ES6), 147 bytes

Takes input as (width)(text).

w=>F=(s,p=S=' ')=>(g=([c,...b],o='',h=c=='-')=>c?o[w-1]?c==S&&o+`
`+F(b):o[w+~h]?o+c+`
`+F(b):c>S?g(b,h?o:o+c):g(b,o+p)||g(b,o+p+c):o)(s)||F(s,p+S)

Try it online!

Commented

w =>                              // w = requested width
  F = (                           // F is a recursive function taking:
    s,                            //   s = either the input string (first iteration) or an
                                  //       array of remaining characters (next iterations)
    p =                           //   p = current space padding
    S = ' '                       //   S = space character
  ) => (                          //
    g = (                         // g is a recursive function taking:
      [c,                         //   c   = next character
          ...b],                  //   b[] = array of remaining characters
      o = '',                     //   o   = output for the current line
      h = c == '-'                //   h   = flag set if c is a hyphen
    ) =>                          //
      c ?                         // if c is defined:
        o[w - 1] ?                //   if the line is full:
          c == S &&               //     fail if c is not a space
          o + `\n` + F(b)         //     otherwise, append o + a linefeed and process the
                                  //     next line
        :                         //   else:
          o[w + ~h] ?             //     if this is the last character and c is a hyphen:
            o + c + `\n` + F(b)   //       append o + c + a linefeed and process the next
                                  //       line
          :                       //     else, we process the next character:
            c > S ?               //       if c is not a space:
              g(b, h ? o : o + c) //         append c if it's not a hyphen
            :                     //       else:
              g(b, o + p) ||      //         append either the current space padding
              g(b, o + p + c)     //         or the current padding and one extra space
      :                           // else:
        o                         //   success: return o
  )(s)                            // initial call to g() with s
  || F(s, p + S)                  // in case of failure, try again with a larger padding
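To make the search strategy in the comments concrete, here is a much-simplified Python 3 sketch. It only handles the words of a single line (the original also decides line breaks and hyphens character by character), and the names `fill` and `place` are hypothetical:

```python
def fill(words, width, pad=1):
    """Return `words` joined into a line of exactly `width` characters, or None."""
    if pad > width:
        return None                      # give up: no padding can work

    def place(i, line):
        if i == len(words):
            return line if len(line) == width else None
        for gap in (pad, pad + 1):       # current padding, or one extra space
            result = place(i + 1, line + " " * gap + words[i])
            if result is not None:
                return result
        return None

    # On failure, try again with a larger padding, as F(s, p + S) does.
    return place(1, words[0]) or fill(words, width, pad + 1)


print(repr(fill(["hello", "hi", "foo", "bar"], 20)))
```

Since each gap is either the current padding or one space wider, any line the search accepts satisfies the spacing rule automatically.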

4

APL (Dyalog Unicode), 129 123 121 118 111 109 107 104 100 95 bytes (SBCS)

{⊃⌽m←⍺≥-⌿c⍪+\⊢c' -'∘.≠⍵:⊂⍵/⍨⊢⌿c⋄(⊂∊ll[(⍺-≢l)⍴⍸' '=l],←⊃0l←⍵/⍨n×⊣⌿c⊖⍨1n),⍺∇⍵/⍨~n←⌽∨\⌽m>×⌿c}

Try it online!



1

Python 2, 343 bytes

W,T=input()
T+=' '
L,l=[],len
while T:
 p,r=0,''
 for i in range(l(T)):
  s=T[:i].replace('-','')
  if'-'==T[i]:s+='-'
  if T[i]in' -'and W-l(s)>=0:p,r=i,s
 R=r.split()
 if R:
  d,k=W-l(''.join(R)),0
  for j in range(d):
   R[k]+=' '
   k+=1
   if k==l(R)-1:k=0
  L+=[''.join(R)]
  T=T[p+1:]
print'\n'.join(L[:-1])
print' '.join(L[-1].split())

Try it online!

The  input  is a block of text
containing possibly hyphenated
words.  For  each space/hyphen
position  p  the code computes
l(p)  the  length  of the line
induced  by splitting the text
to this space/hyphen. Then the
code  chooses  the  position p
for  which the  length l(p) is
the closest to the given width
W (and l(p)<=W). If l(p)<W the
code  adds spaces  fairly  in-
between  the  words to achieve
the length W.
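A Python 3 sketch of the break-selection step described above, under the assumption that words are separated by single spaces; `best_break` is a hypothetical name and this is not the golfed code itself:

```python
def best_break(text, width):
    """Return (line, rest) for the longest line that fits in `width`, or None."""
    text += " "                      # sentinel so the end of the text is a break too
    best = None
    for p, ch in enumerate(text):
        if ch not in " -":
            continue                 # only spaces and hyphens are break positions
        # l(p): text up to p with unused hyphens dropped, plus a hyphen if we break on one.
        line = text[:p].replace("-", "") + ("-" if ch == "-" else "")
        if len(line) <= width:
            best = (line, text[p + 1:].strip())
    return best


print(best_break("The q-uick brown fox ju-mp-s ove-r t-h-e lazy dog.", 20))
```

Candidate lines only get longer as p moves right, so the last candidate that fits is also the longest one; calling the function again on `rest` yields the following line.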

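Although input can be in any format you like, it should still come from STDIN or from parameters. See the defaults for I/O. We generally don't allow "input" to come from pre-assigned variables.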
mbomb007

You can save a byte by doing print'\n'.join(L[:-1]) instead of for e in L[:-1]:print e
mbomb007

@mbomb007 OK, I'll make the necessary changes to respect the I/O rules.
mdahmoune
Licensed under cc by-sa 3.0 with attribution required.