Create a Twitter parser


14

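Introduction

You use Twitter (if not, pretend), where every tweet you want to share with the world is limited to 140 characters. If you wanted to tweet Abraham Lincoln's Gettysburg Address to your followers, you would need to break the text into 140-character chunks to get the whole message out. However, those chunks should not always be exactly 140 characters long. Say, for example, we broke the speech into 17-character chunks; we would end up with these tweets: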

  • FOUR SCORE AND SE
  • VEN YEARS AGO OUR
  • FATHERS BROUGHT F
  • ORTH ON THIS CONT
  • INENT A NEW NATIO
  • N CONCEIVED IN LI
  • (and so on)

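That's no good! When individual words are broken up, it can become hard to understand what you are trying to say. Also, in the twitterverse, one of your followers may come across a single tweet without realizing there is more to the message, so you will want to number your tweets to give them some context (still using 17-character chunks):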

  • (1/7) FOUR SCORE AND
  • (2/7) SEVEN YEARS AGO
  • (3/7) OUR FATHERS
  • (4/7) BROUGHT FORTH ON
  • (5/7) THIS CONTINENT A
  • (6/7) NEW NATION
  • (7/7) CONCEIVED IN...

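You could work out the best layout for your tweets by hand, but that's what we have computers for!

Challenge

In the shortest code possible, parse the Gettysburg Address (or any text, but we will use this one as the example) into a set of tweets of no more than 140 characters each (assume ASCII, since the example text should not contain any uncommon or unusual bits).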

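Details

  • Your function/program/etc. should accept a single string argument and output one line of text per tweet.
    • Assume the input will never result in more than 99 total tweets when parsed, however you choose to parse it (as long as that choice still fits the other points of the challenge).
  • Tweets need to include a "tweet number of total tweets" indicator in the format "(x/y)" preceding the body of the tweet.
    • This count takes up part of your 140-character space!
  • Tweet chunks may only be split on newlines or spaces.
    • Hyphens, periods, commas and other punctuation are not allowed as split points unless immediately preceded or followed by a space or newline.
  • Tweets should consist of as many complete words as possible.
    • This constraint is a little flexible, e.g. when your final tweet only has one word.
  • This is code golf, so the shortest code wins.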
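The rules above can be checked mechanically. What follows is a minimal Python sketch of such a checker; it is not part of the original challenge, and the name `check_tweets` and its list-of-strings return value are assumptions made for illustration. It checks the length limit, the (x/y) indicator, and that the tweet bodies reassemble into the input's words. It does not test whether each tweet holds as many words as possible.

```python
import re


def check_tweets(text, tweets, limit=140):
    """Return a list of rule violations; an empty list means none were found.

    Hypothetical helper, not from the original thread.
    """
    problems = []
    total = len(tweets)
    words = []
    for number, tweet in enumerate(tweets, start=1):
        if len(tweet) > limit:
            problems.append(f"tweet {number} is {len(tweet)} characters long")
        # Expect "(x/y)" first, then optional whitespace, then the body.
        match = re.match(r"\((\d+)/(\d+)\)\s*(.*)", tweet, re.S)
        if not match:
            problems.append(f"tweet {number} has no (x/y) indicator")
            continue
        if (int(match.group(1)), int(match.group(2))) != (number, total):
            problems.append(
                f"tweet {number} is labelled {match.group(1)}/{match.group(2)}"
            )
        words.extend(match.group(3).split())
    # Splitting only on spaces or newlines keeps every input word intact.
    if words != text.split():
        problems.append("tweet bodies do not reassemble into the input words")
    return problems
```

For instance, `check_tweets("a b c", ["(1/2) a b", "(2/2) c"])` returns an empty list, while a split in the middle of a word is reported.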

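Full text of the Gettysburg Address

(Your code should still be able to handle any ASCII string passed to it.)

FOUR SCORE AND SEVEN YEARS AGO OUR FATHERS BROUGHT FORTH ON THIS CONTINENT A NEW NATION CONCEIVED IN LIBERTY AND DEDICATED TO THE PROPOSITION THAT ALL MEN ARE CREATED EQUAL. NOW WE ARE ENGAGED IN A GREAT CIVIL WAR TESTING WHETHER THAT NATION OR ANY NATION SO CONCEIVED AND SO DEDICATED CAN LONG ENDURE. WE ARE MET ON A GREAT BATTLEFIELD OF THAT WAR. WE HAVE COME TO DEDICATE A PORTION OF THAT FIELD AS A FINAL RESTING PLACE FOR THOSE WHO HERE GAVE THEIR LIVES THAT THAT NATION MIGHT LIVE. IT IS ALTOGETHER FITTING AND PROPER THAT WE SHOULD DO THIS. BUT IN A LARGER SENSE WE CAN NOT DEDICATE, WE CAN NOT CONSECRATE, WE CAN NOT HALLOW, THIS GROUND. THE BRAVE MEN LIVING AND DEAD WHO STRUGGLED HERE HAVE CONSECRATED IT FAR ABOVE OUR POOR POWER TO ADD OR DETRACT. THE WORLD WILL LITTLE NOTE NOR LONG REMEMBER WHAT WE SAY HERE BUT IT CAN NEVER FORGET WHAT THEY DID HERE. IT IS FOR US THE LIVING RATHER TO BE DEDICATED HERE TO THE UNFINISHED WORK WHICH THEY WHO FOUGHT HERE HAVE THUS FAR SO NOBLY ADVANCED. IT IS RATHER FOR US TO BE HERE DEDICATED TO THE GREAT TASK REMAINING BEFORE US, THAT FROM THESE HONORED DEAD WE TAKE INCREASED DEVOTION TO THAT CAUSE FOR WHICH THEY GAVE THE LAST FULL MEASURE OF DEVOTION, THAT WE HERE HIGHLY RESOLVE THAT THESE DEAD SHALL NOT HAVE DIED IN VAIN, THAT THIS NATION UNDER GOD SHALL HAVE A NEW BIRTH OF FREEDOM, AND THAT GOVERNMENT OF THE PEOPLE BY THE PEOPLE FOR THE PEOPLE SHALL NOT PERISH FROM THE EARTH.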


In your example tweets you break words, but in the rules you forbid it. Please be consistent: change either the rules or the example.
2013

@boothby Well, the example is meant to show explicitly what is not allowed... I'll see if I can reword that.
Gaffi 2013

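Please clarify the question so it is explicit whether an answer must handle the possibility that the word-to-tweet wrapping has to be recomputed because of the (X/Y) part. This makes the problem significantly harder, and it introduces an upper bound on the longest message that can be conveyed without encoding data in the (X/Y) part.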
arrdem

@rmckenzie Sorry, I'm not sure I follow you. Are you asking whether to account for the fact that a set might contain 100+ tweets?
Gaffi 2013

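@Gaffi - yes. Is it OK to assume a maximum number of parts into which the message is divided (in which case the assumption can be made and grc's solution is valid), or do we have to support the general case, e.g. serializing an arbitrary number of words into tweets, as your problem statement suggests?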
arrdem

Answers:


12

Perl, 51 characters

s#\G(.{1,132})(\s+|$)#(${\++$a}/~) $1\n#g;s#~#$a#g

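Requires the -p command-line switch; 1 character is included in the count for it.

Explanation: inserts the count portion before, and a newline after, each group of words of at most 132 characters. A placeholder (~) is inserted for the total and then replaced by the second substitution. This breaks if the message contains ~, but you could easily use a non-printable character instead.

It cheats a little: the count portion is always allowed seven characters, (nn/nn). Really, when the count is only (n/n), two extra characters should be allowed for the body. However, a fully general solution to this would add considerably to the complexity of the problem.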
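The two-pass idea described above (number each chunk, leave a placeholder for the total, fill it in afterwards) can be sketched in Python as follows. This is an illustrative port, not the author's code: the function name `tweets_with_placeholder` is hypothetical, a NUL character stands in for `~` on the assumption that it never occurs in the input, and no single word is assumed to exceed `body_width` characters.

```python
import re

PLACEHOLDER = "\0"  # assumption: this character never appears in the input


def tweets_with_placeholder(text, body_width=132):
    """Fixed-width split with a placeholder for the total (hypothetical name)."""
    count = 0

    def numbered(match):
        # First pass: prefix each chunk with its number and a placeholder total.
        nonlocal count
        count += 1
        return f"({count}/{PLACEHOLDER}) {match.group(1)}\n"

    # Up to body_width characters, followed by whitespace or the end of input.
    pattern = r"(.{1,%d})(?:\s+|$)" % body_width
    first_pass = re.sub(pattern, numbered, text)
    # Second pass: the total is known now, so fill in every placeholder.
    second_pass = first_pass.replace(PLACEHOLDER, str(count))
    return [line for line in second_pass.split("\n") if line]
```

With a body width of 7, `tweets_with_placeholder("aaa bbb ccc", 7)` yields `["(1/2) aaa bbb", "(2/2) ccc"]`. Like the Perl answer, this always reserves the same width for the count, whatever the actual number of digits.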


\G is useless here, isn't it?
user2846289 2014

26

Python, 140

^ The 140 characters is actually a coincidence.

def f(s):
 s=s.split();i=0;l=[]
 while s:
  i+=1;t='(%d/%%d)'%i
  while s and len(t+s[0])<140:t+=' '+s.pop(0)
  l+=[t]
 for t in l:print t%i

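While words remain, this solution builds a new tweet from the remaining words and appends it to a list. For each tweet, it keeps adding words until the next word would push the tweet past 140 characters. Two characters are reserved for the total tweet count, which is filled in later when each tweet in the list is printed.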
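For readers who want to run this approach on Python 3, here is an ungolfed sketch of the same greedy idea. It is not the answer's code: the name `greedy_tweets` is hypothetical, the prefix and body are kept apart so that a `%` in the input is not treated as a format character, and an error is raised for a word that can never fit.

```python
def greedy_tweets(text, limit=140):
    """Greedy word packing with two characters reserved for the total."""
    words = text.split()
    tweets = []  # (prefix template, body) pairs
    while words:
        # "(3/%d)": the literal "%d" stands in for a total of up to two digits.
        prefix = "(%d/%%d)" % (len(tweets) + 1)
        body = ""
        # Add words while prefix, body, a space and the next word still fit.
        while words and len(prefix + body) + 1 + len(words[0]) <= limit:
            body += " " + words.pop(0)
        if not body:
            raise ValueError("a single word does not fit into one tweet")
        tweets.append((prefix, body))
    total = len(tweets)
    # Only the prefix is formatted, so the body text is left untouched.
    return [prefix % total + body for prefix, body in tweets]
```

Because exactly two characters are reserved for the total, this sketch relies on the challenge's guarantee of at most 99 tweets.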

Sample output:

(1/11) FOUR SCORE AND SEVEN YEARS AGO OUR FATHERS BROUGHT FORTH ON THIS CONTINENT A NEW NATION CONCEIVED IN LIBERTY AND DEDICATED TO THE
(2/11) PROPOSITION THAT ALL MEN ARE CREATED EQUAL. NOW WE ARE ENGAGED IN A GREAT CIVIL WAR TESTING WHETHER THAT NATION OR ANY NATION SO
(3/11) CONCEIVED AND SO DEDICATED CAN LONG ENDURE. WE ARE MET ON A GREAT BATTLEFIELD OF THAT WAR. WE HAVE COME TO DEDICATE A PORTION OF THAT
(4/11) FIELD AS A FINAL RESTING PLACE FOR THOSE WHO HERE GAVE THEIR LIVES THAT THAT NATION MIGHT LIVE. IT IS ALTOGETHER FITTING AND PROPER
(5/11) THAT WE SHOULD DO THIS. BUT IN A LARGER SENSE WE CAN NOT DEDICATE, WE CAN NOT CONSECRATE, WE CAN NOT HALLOW, THIS GROUND. THE BRAVE
(6/11) MEN LIVING AND DEAD WHO STRUGGLED HERE HAVE CONSECRATED IT FAR ABOVE OUR POOR POWER TO ADD OR DETRACT. THE WORLD WILL LITTLE NOTE NOR
(7/11) LONG REMEMBER WHAT WE SAY HERE BUT IT CAN NEVER FORGET WHAT THEY DID HERE. IT IS FOR US THE LIVING RATHER TO BE DEDICATED HERE TO THE
(8/11) UNFINISHED WORK WHICH THEY WHO FOUGHT HERE HAVE THUS FAR SO NOBLY ADVANCED. IT IS RATHER FOR US TO BE HERE DEDICATED TO THE GREAT
(9/11) TASK REMAINING BEFORE US, THAT FROM THESE HONORED DEAD WE TAKE INCREASED DEVOTION TO THAT CAUSE FOR WHICH THEY GAVE THE LAST FULL
(10/11) MEASURE OF DEVOTION, THAT WE HERE HIGHLY RESOLVE THAT THESE DEAD SHALL NOT HAVE DIED IN VAIN, THAT THIS NATION UNDER GOD SHALL HAVE
(11/11) A NEW BIRTH OF FREEDOM, AND THAT GOVERNMENT OF THE PEOPLE BY THE PEOPLE FOR THE PEOPLE SHALL NOT PERISH FROM THE EARTH.

Just what I needed. This solution is cool. Thanks @grc
iChux 2014

7

Ruby, 77 characters

f=->t{i=0;$><<t.gsub(/(.{1,132})([ \n]|$)/m){"(#{i+=1}/%{i}) #{$1}\n"}%{i:i}}

Packs the logic into a single regular expression. Output of f[text]:

(1/11) FOUR SCORE AND SEVEN YEARS AGO OUR FATHERS BROUGHT FORTH ON THIS CONTINENT A NEW NATION CONCEIVED IN LIBERTY AND DEDICATED TO THE
(2/11) PROPOSITION THAT ALL MEN ARE CREATED EQUAL. NOW WE ARE ENGAGED IN A GREAT CIVIL WAR TESTING WHETHER THAT NATION OR ANY NATION SO
(3/11) CONCEIVED AND SO DEDICATED CAN LONG ENDURE. WE ARE MET ON A GREAT BATTLEFIELD OF THAT WAR. WE HAVE COME TO DEDICATE A PORTION OF
(4/11) THAT FIELD AS A FINAL RESTING PLACE FOR THOSE WHO HERE GAVE THEIR LIVES THAT THAT NATION MIGHT LIVE. IT IS ALTOGETHER FITTING AND
(5/11) PROPER THAT WE SHOULD DO THIS. BUT IN A LARGER SENSE WE CAN NOT DEDICATE, WE CAN NOT CONSECRATE, WE CAN NOT HALLOW, THIS GROUND. THE
(6/11) BRAVE MEN LIVING AND DEAD WHO STRUGGLED HERE HAVE CONSECRATED IT FAR ABOVE OUR POOR POWER TO ADD OR DETRACT. THE WORLD WILL LITTLE
(7/11) NOTE NOR LONG REMEMBER WHAT WE SAY HERE BUT IT CAN NEVER FORGET WHAT THEY DID HERE. IT IS FOR US THE LIVING RATHER TO BE DEDICATED
(8/11) HERE TO THE UNFINISHED WORK WHICH THEY WHO FOUGHT HERE HAVE THUS FAR SO NOBLY ADVANCED. IT IS RATHER FOR US TO BE HERE DEDICATED TO
(9/11) THE GREAT TASK REMAINING BEFORE US, THAT FROM THESE HONORED DEAD WE TAKE INCREASED DEVOTION TO THAT CAUSE FOR WHICH THEY GAVE THE
(10/11) LAST FULL MEASURE OF DEVOTION, THAT WE HERE HIGHLY RESOLVE THAT THESE DEAD SHALL NOT HAVE DIED IN VAIN, THAT THIS NATION UNDER GOD
(11/11) SHALL HAVE A NEW BIRTH OF FREEDOM, AND THAT GOVERNMENT OF THE PEOPLE BY THE PEOPLE FOR THE PEOPLE SHALL NOT PERISH FROM THE EARTH.

3

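Ruby, 75

It can't beat Perl, but at least it beats the other Ruby solution. Note that it prints the tweets in reverse order (the question doesn't specify an order).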

f=->t,i=1{t=~/\S.{,130}\S(?!\S)/?puts("(#{i}/%d) #$&"%n=f[$',i+1])||n :i-1}

1
Hmm... I guess I didn't specify the order. Nice job of hijacking the rules. ;-)
Gaffi 2013

1

VBA, 251

Tried a different approach... It is not as good as my original, but I'm still working on it...

Sub a(s)
Dim n(99)
m=1
r=Split(StrConv(s,64),Chr(0))
For i=0 To Len(s)
If i-g>132 Then n(m)=Mid(s,g+1,u-g):i=u:g=i:m=m+1
If r(i)=" " Or r(i)=vbCr Then i=i+1:u=i
Next
n(m)=Mid(s,g+1)
For o=1 To m
Debug.Print "(" & o & "/" & m & ") " & n(o)
Next
End Sub

Output:

(1/11) FOUR SCORE AND SEVEN YEARS AGO OUR FATHERS BROUGHT FORTH ON THIS CONTINENT A NEW NATION CONCEIVED IN LIBERTY AND DEDICATED TO THE 
(2/11) PROPOSITION THAT ALL MEN ARE CREATED EQUAL. NOW WE ARE ENGAGED IN A GREAT CIVIL WAR TESTING WHETHER THAT NATION OR ANY NATION SO 
(3/11) CONCEIVED AND SO DEDICATED CAN LONG ENDURE. WE ARE MET ON A GREAT BATTLEFIELD OF THAT WAR. WE HAVE COME TO DEDICATE A PORTION OF 
(4/11) THAT FIELD AS A FINAL RESTING PLACE FOR THOSE WHO HERE GAVE THEIR LIVES THAT THAT NATION MIGHT LIVE. IT IS ALTOGETHER FITTING AND 
(5/11) PROPER THAT WE SHOULD DO THIS. BUT IN A LARGER SENSE WE CAN NOT DEDICATE, WE CAN NOT CONSECRATE, WE CAN NOT HALLOW, THIS GROUND. THE 
(6/11) BRAVE MEN LIVING AND DEAD WHO STRUGGLED HERE HAVE CONSECRATED IT FAR ABOVE OUR POOR POWER TO ADD OR DETRACT. THE WORLD WILL LITTLE 
(7/11) NOTE NOR LONG REMEMBER WHAT WE SAY HERE BUT IT CAN NEVER FORGET WHAT THEY DID HERE. IT IS FOR US THE LIVING RATHER TO BE DEDICATED 
(8/11) HERE TO THE UNFINISHED WORK WHICH THEY WHO FOUGHT HERE HAVE THUS FAR SO NOBLY ADVANCED. IT IS RATHER FOR US TO BE HERE DEDICATED TO 
(9/11) THE GREAT TASK REMAINING BEFORE US, THAT FROM THESE HONORED DEAD WE TAKE INCREASED DEVOTION TO THAT CAUSE FOR WHICH THEY GAVE THE 
(10/11) LAST FULL MEASURE OF DEVOTION, THAT WE HERE HIGHLY RESOLVE THAT THESE DEAD SHALL NOT HAVE DIED IN VAIN, THAT THIS NATION UNDER GOD 
(11/11) SHALL HAVE A NEW BIRTH OF FREEDOM, AND THAT GOVERNMENT OF THE PEOPLE BY THE PEOPLE FOR THE PEOPLE SHALL NOT PERISH FROM THE EARTH.

1

Bash (88 characters)

fold -132 -s |tac|cat -n |tac|awk '{if(NR==1)a=$1;$1="";printf "(%d/%d) %s\n",NR,a,$0 }'

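Folds lines at 132 characters on spaces (-s), leaving room for the tweet count; reads the text backwards (tac); numbers the lines (cat -n); then reverses it again (tac). Inside Awk: on the first line (NR==1), store the first field, which is the total number of lines, in the variable 'a'. Blank out the number column. Print (NR/a), followed by the line.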
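A rough Python counterpart to the fold-then-number pipeline can be written with the standard library's `textwrap.wrap`. This is only a sketch under assumptions: the name `fold_tweets` is hypothetical, and `textwrap` does not behave exactly like `fold -s` (it collapses whitespace and drops the trailing space of each line), so the tweet boundaries may differ from the Bash output.

```python
import textwrap


def fold_tweets(text, body_width=132):
    """Wrap on whitespace only, then prefix every line with (x/y)."""
    bodies = textwrap.wrap(
        text,
        width=body_width,
        break_long_words=False,  # never cut inside a word
        break_on_hyphens=False,  # hyphens are not valid split points here
    )
    # The total is known once wrapping is done, so no reverse pass is needed.
    total = len(bodies)
    return [f"({n}/{total}) {body}" for n, body in enumerate(bodies, start=1)]
```

Having the whole list in memory makes the tac / cat -n / tac trick unnecessary: it exists in the shell version only to put the line total on the first line that Awk reads.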

Output:

(1/12)  FOUR SCORE AND SEVEN YEARS AGO OUR FATHERS BROUGHT FORTH ON THIS CONTINENT A NEW NATION CONCEIVED IN LIBERTY AND DEDICATED TO THE
(2/12)  PROPOSITION THAT ALL MEN ARE CREATED EQUAL. NOW WE ARE ENGAGED IN A GREAT CIVIL WAR TESTING WHETHER THAT NATION OR ANY NATION SO
(3/12)  CONCEIVED AND SO DEDICATED CAN LONG ENDURE. WE ARE MET ON A GREAT BATTLEFIELD OF THAT WAR. WE HAVE COME TO DEDICATE A PORTION OF
(4/12)  THAT FIELD AS A FINAL RESTING PLACE FOR THOSE WHO HERE GAVE THEIR LIVES THAT THAT NATION MIGHT LIVE. IT IS ALTOGETHER FITTING AND
(5/12)  PROPER THAT WE SHOULD DO THIS. BUT IN A LARGER SENSE WE CAN NOT DEDICATE, WE CAN NOT CONSECRATE, WE CAN NOT HALLOW, THIS GROUND.
(6/12)  THE BRAVE MEN LIVING AND DEAD WHO STRUGGLED HERE HAVE CONSECRATED IT FAR ABOVE OUR POOR POWER TO ADD OR DETRACT. THE WORLD WILL
(7/12)  LITTLE NOTE NOR LONG REMEMBER WHAT WE SAY HERE BUT IT CAN NEVER FORGET WHAT THEY DID HERE. IT IS FOR US THE LIVING RATHER TO BE
(8/12)  DEDICATED HERE TO THE UNFINISHED WORK WHICH THEY WHO FOUGHT HERE HAVE THUS FAR SO NOBLY ADVANCED. IT IS RATHER FOR US TO BE HERE
(9/12)  DEDICATED TO THE GREAT TASK REMAINING BEFORE US, THAT FROM THESE HONORED DEAD WE TAKE INCREASED DEVOTION TO THAT CAUSE FOR WHICH
(10/12)  THEY GAVE THE LAST FULL MEASURE OF DEVOTION, THAT WE HERE HIGHLY RESOLVE THAT THESE DEAD SHALL NOT HAVE DIED IN VAIN, THAT THIS
(11/12)  NATION UNDER GOD SHALL HAVE A NEW BIRTH OF FREEDOM, AND THAT GOVERNMENT OF THE PEOPLE BY THE PEOPLE FOR THE PEOPLE SHALL NOT PERISH
(12/12)  FROM THE EARTH.

I believe there would be a shorter bash command that abuses wc -l
Pureferret 2014

1

Javascript (FF only), 92 characters

r=(s)=>{s.match(/.{1,132}(\s|$)/gm).map((v,i,a)=>console.log(v,'('+(i+1)+'/'+a.length+')'))}

Formatted (this is essentially the Perl script in condensed form):

r=(s)=>{
    s.match(/.{1,132}(\s|$)/gm).map((v,i,a) => console.log(v,'('+(i+1)+'/'+a.length+')'))
}

0

VBA, 227

Sub a(s)
Dim n(99)
m=1
x=133
While Len(s)>x
t=Left(s,x):p=InStrRev(t," "):q=InStrRev(t,vbCr):i=IIf(p>q,p,q):t=Left(s,i):s=Mid(s,i+1):n(m)=t:m=m+1
Wend
n(m)=s
For o=1 To m
Debug.Print "(" & o & "/" & m & ") " & n(o)
Next
End Sub

Output:

(1/11) FOUR SCORE AND SEVEN YEARS AGO OUR FATHERS BROUGHT FORTH ON THIS CONTINENT A NEW NATION CONCEIVED IN LIBERTY AND DEDICATED TO THE 
(2/11) PROPOSITION THAT ALL MEN ARE CREATED EQUAL. NOW WE ARE ENGAGED IN A GREAT CIVIL WAR TESTING WHETHER THAT NATION OR ANY NATION SO 
(3/11) CONCEIVED AND SO DEDICATED CAN LONG ENDURE. WE ARE MET ON A GREAT BATTLEFIELD OF THAT WAR. WE HAVE COME TO DEDICATE A PORTION OF 
(4/11) THAT FIELD AS A FINAL RESTING PLACE FOR THOSE WHO HERE GAVE THEIR LIVES THAT THAT NATION MIGHT LIVE. IT IS ALTOGETHER FITTING AND 
(5/11) PROPER THAT WE SHOULD DO THIS. BUT IN A LARGER SENSE WE CAN NOT DEDICATE, WE CAN NOT CONSECRATE, WE CAN NOT HALLOW, THIS GROUND. THE 
(6/11) BRAVE MEN LIVING AND DEAD WHO STRUGGLED HERE HAVE CONSECRATED IT FAR ABOVE OUR POOR POWER TO ADD OR DETRACT. THE WORLD WILL LITTLE 
(7/11) NOTE NOR LONG REMEMBER WHAT WE SAY HERE BUT IT CAN NEVER FORGET WHAT THEY DID HERE. IT IS FOR US THE LIVING RATHER TO BE DEDICATED 
(8/11) HERE TO THE UNFINISHED WORK WHICH THEY WHO FOUGHT HERE HAVE THUS FAR SO NOBLY ADVANCED. IT IS RATHER FOR US TO BE HERE DEDICATED TO 
(9/11) THE GREAT TASK REMAINING BEFORE US, THAT FROM THESE HONORED DEAD WE TAKE INCREASED DEVOTION TO THAT CAUSE FOR WHICH THEY GAVE THE 
(10/11) LAST FULL MEASURE OF DEVOTION, THAT WE HERE HIGHLY RESOLVE THAT THESE DEAD SHALL NOT HAVE DIED IN VAIN, THAT THIS NATION UNDER GOD 
(11/11) SHALL HAVE A NEW BIRTH OF FREEDOM, AND THAT GOVERNMENT OF THE PEOPLE BY THE PEOPLE FOR THE PEOPLE SHALL NOT PERISH FROM THE EARTH.

0

Javascript (FF only), 135 characters

n=(s)=>{for(g=[],i=1,a=s.split(/(\s)/),r='';c=a.shift();g[i]=r+=c)if((c+r)[132]&&i++)r='';g.map((v,k)=>console.log(v,'('+k+'/'+i+')'))}

Formatted:

n=(s)=>{
    for (g=[],i=1,a=s.split(/(\s)/),r=''; c=a.shift(); g[i]=r+=c) {
        if((c+r)[132]&&i++) {
            r='';
        }
    }
    g.map((v,k)=>console.log(v,'('+k+'/'+i+')'))
}

I think this is cleverer than my shorter answer
Not that Charles

0

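PHP, 233

Am I right in thinking that this is the first answer that does not cheat on the count portion?
(It also handles more than 99 tweets; I could save two more bytes if an infinite loop were allowed in that case.)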

function t($s,$e=1){$a=explode(' ',$s);while($a){$t=++$n;while($a&&strlen($t.$a[0])<137-$e)$t.=' '.array_shift($a);$r[]=$t;}if($n>=10**$e)t($s,$e+1);else foreach($r as$i=>$s)echo preg_replace('%(^\d+)%',"(\$1/$n)",$s),'
';}

Ungolfed

function t($s,$e=1)
{
    $a=explode(' ',$s);
    while($a)
    {
        $t=++$n;
        while($a&&strlen($t.$a[0])<137-$e)$t.=' '.array_shift($a);
        $r[]=$t;
    }
    if($n>=10**$e)                  // if tweet count has more than $e digits
        t($s,ceil(log10($n+1)));    // use correct length (golfed: try with length+1)
    else
        foreach($r as$i=>$s)
            echo preg_replace('%(^\d+)%',"(\$1/$n)",$s),"\n";
}
Licensed under cc by-sa 3.0 with attribution required.