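Sometimes it happens that while typing a sentence, I am distracted and I end up typing the same couple of words twice couple of words twice in succession.

To make sure other people are not bothered by this, your task is to write a program that resolves this problem!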

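Task

Given an input string str (if it matters for your language, you may assume ASCII-only input that contains no linefeeds) that contains, somewhere in it, a substring occurring twice in immediate succession, return the string with one instance of that substring removed.

If there are multiple possibilities, return the shortest possible answer (that is, pick the longest consecutive repeating substring and remove that one).

If there are multiple equally long consecutive repeating substrings, remove the first one (that is, the first one encountered when reading through the string from front to back).

You may assume that the input is correct (i.e. it always contains a consecutive repeating substring), which might help to golf it down.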

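Examples

Input: hello hello world -> Output: hello world.
Input: foofoo -> Output: foo. (So: yes, the string might consist only of the repeating part twice).
Input: aaaaa -> Output: aaa, as the longest repeating consecutive substring here is aa.
Input: Slartibartfast -> This is not a valid input, as it does not contain a consecutive repeating substring, so you do not need to handle this case.
Input: the few the bar -> This is another invalid input, since the repeating part should immediately follow the original part. In this case, the and the are separated by something else, so this input is invalid.
Input: ababcbc -> Output: abcbc. The two possible longest consecutive repeating substrings are ab and bc. As ab is encountered earlier in the string, this one is the correct answer.
Input: Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo -> Output: Buffalo buffalo buffalo buffalo Buffalo buffalo. (The performed replacement should be case-sensitive).
Input: Sometimes it happens that while typing a sentence, I am distracted and I end up typing the same couple of words twice couple of words twice in succession. -> Output: Sometimes it happens that while typing a sentence, I am distracted and I end up typing the same couple of words twice in succession. Only the longest consecutive repeating substring is removed.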
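
For readers who want to check an answer against the examples, here is an ungolfed reference sketch in Python. It is not part of the original challenge: the function name remove_longest_repeat is made up for illustration, and the code simply assumes valid input (on invalid input it returns the string unchanged).

```python
def remove_longest_repeat(s: str) -> str:
    """Remove one copy of the first longest immediately repeated substring."""
    best_start, best_size = 0, 0
    for start in range(len(s)):
        # Try only sizes strictly larger than the best so far, largest first,
        # so an earlier start position wins whenever two repeats tie.
        for size in range((len(s) - start) // 2, best_size, -1):
            if s[start:start + size] == s[start + size:start + 2 * size]:
                best_start, best_size = start, size
                break
    # Cut out the first copy of the repeated substring.
    return s[:best_start] + s[best_start + best_size:]


for example in ("hello hello world", "foofoo", "aaaaa", "ababcbc"):
    print(example, "->", remove_longest_repeat(example))
```

On these four inputs it prints hello world, foo, aaa and abcbc, matching the examples above.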

Your code should be as short as possible, since this is code-golf, so the shortest answer in bytes wins. Good luck!

— q

@manatwork When taking the first sentence, i.e.

Sometimes it happens that while typing a sentence, I am distracted and I end up typing the same couple of words twice couple of words twice in succession.

as input, the output should be

Sometimes it happens that while typing a sentence, I am distracted and I end up typing the same couple of words twice in succession.

Only the longest duplication found is removed.

— Qqwy

1

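I suggest adding a test that has two possible replacements, where the second one is longer than the first. I suspect most answers won't pass that one :)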

— aross

@aross Test case 8 is exactly that :)

— Qqwy

Unless my test code and I are both mistaken, there is only one repeated string in there.

— aross

@aross There is a double p in happens.

— Qqwy

8

Perl 6, 40 bytes

{.subst: m:ex/(.*))>$0/.max(*.chars),''}

Try it

{
  .subst:             # substitute


    m                 # match
    :exhaustive
    /
      ( .* )          # any number of chars

      )>              # don't include the following in what is returned

      $0              # the first match again
    /.max( *.chars ), # find the first longest submatch


    ''                # substitute it with nothing
}

— Brad Gilbert b2gills

8

Retina, 35 33 bytes

Byte count assumes ISO 8859-1 encoding.

(?=(.+)(\1.*))
$2¶$`
O$#`
$.&
G1`

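Try it online!

Explanation

Since regex engines look for matches from left to right, it is not trivial to find the longest match regardless of position. It can be done with .NET's balancing groups, but the result is rather long: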

1`((.)+)\1(?<=(?!.*((?>(?<-2>.)+).+)\3)^.*)
$1

So I figured I'd try to avoid that by making use of some other Retina features.

(?=(.+)(\1.*))
$2¶$`

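We start by essentially applying all possible substitutions, one on each line. To do this, we match the position in front of a match (instead of the match itself), to allow for overlapping matches. This is done by putting the real regex into a lookahead. That lookahead then captures, in group 2, the remainder of the input except for the duplicate we want to remove. We write back group 2 (which deletes the duplicate), then a linefeed, and then the entire input up to the match, which basically gives us a fresh line to be substituted.

In the end, we have one line for each match, with the corresponding duplicate removed. At the very end there is also the full input again, without any substitution applied.

Now that we have all the possible substitutions, we want the shortest result (which corresponds to the longest removed repetition).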

O$#`
$.&

So we first sort the lines by length.

G1`

And then we keep only the first line.
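
The same pipeline can be sketched outside Retina. The following Python is only an approximation under stated assumptions: Python's re module stands in for the .NET regex flavour, the helper name dedupe_retina_style is hypothetical, and the input is assumed to contain no linefeeds.

```python
import re


def dedupe_retina_style(s: str) -> str:
    # One candidate per position where a doubled substring starts: the text
    # before that position plus group 2 (one copy of the repeat and the rest).
    lines = [s[:m.start()] + m.group(2)
             for m in re.finditer(r"(?=(.+)(\1.*))", s)]
    lines.append(s)      # the untouched input ends up as the last line
    lines.sort(key=len)  # list.sort is stable, so ties keep left-to-right order
    return lines[0]      # keep only the first (shortest) line


print(dedupe_retina_style("ababcbc"))  # abcbc
```

Because the sort is stable, the candidate produced by the leftmost repeat stays ahead of any other candidate of the same length.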

— Martin Ender

Wow, that replacement technique is really clever!

— Leo

6

Jelly, 22 19 bytes

-2 bytes thanks to Dennis (avoid an argument reversal, remove a subtly redundant increment)

ẋ2³wȧ+¥J
ẆÇ€LÐṀḢṬœp

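Try it online!

Full program (a bug has been found where ÐṀ does not act with the correct arity over dyads; it will be fixed soon, although I'm not sure it would make for shorter code here).

How?

Finds the first of the longest slices of the input such that a repetition of it exists in the input, and removes it from the input.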

ẋ2³wȧ+¥J - Link 1, removal indices for given slice if valid, else 0: slice, x
ẋ2       - repeat x twice, say y
  ³      - program input: s
   w     - index of first occurrence of y in s (1-based) or 0, say i
       J - range(length(x)): [1,2,3,...,length(x)]
      ¥  - last two links as a dyad
    ȧ    -     and (non-vectorising)
     +   -     addition: [1+i,2+i,3+i,...,length(x)+i] or 0
         - note: no need to decrement these since the last index will be the 1st index
         - of the repetition (thanks to Dennis for spotting that!)

ẆÇ€LÐṀḢṬœp - Main link: string, s
Ẇ          - all sublists of s (order is short to long, left to right, e.g. a,b,c,ab,bc,abc)
 Ç€        - call the last link (1) as a monad for €ach
    ÐṀ     - filter by maximal
   L       -     length
      Ḣ    - head: get the first (and hence left-most) one
       Ṭ   - untruth: make a list with 1s at the indexes given and 0s elsewhere
        œp - partition s at truthy indexes of that, throwing away the borders
           - implicit print
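
The note about not decrementing the indices can be checked with a small Python sketch. The helper names drop_indices and collapse are hypothetical, and indices here are 0-based rather than Jelly's 1-based. The point it illustrates: deleting a window that starts one character into the doubled slice leaves the same string as deleting the first copy.

```python
def drop_indices(s: str, indices) -> str:
    """Return s without the characters at the given 0-based indices."""
    drop = set(indices)
    return "".join(ch for j, ch in enumerate(s) if j not in drop)


def collapse(s: str, x: str, shift: int) -> str:
    """Delete len(x) characters starting `shift` places into the first x+x."""
    i = s.find(x + x)
    if i < 0:
        return s  # x is not doubled anywhere, nothing to remove
    return drop_indices(s, range(i + shift, i + shift + len(x)))


# Removing the first copy (shift 0) or the window shifted by one (shift 1)
# both leave a single copy of "ab" behind.
print(collapse("ababcbc", "ab", 0))  # abcbc
print(collapse("ababcbc", "ab", 1))  # abcbc
```

The shifted window removes a rotation of the slice, and what remains on either side of it still spells out one copy.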

— Jonathan Allan

6

JavaScript (ES6), 81 74 bytes

f=
s=>s.replace(/(?=(.+)\1)/g,(_,m)=>r=m[r.length]?m:r,r='')&&s.replace(r,'')

<input oninput=o.textContent=f(this.value)><pre id=o>


Edit: Saved 7 bytes by stealing @Arnauld's m[r.length] trick.

— Neil

5

PowerShell, 87 bytes

param($s)([regex](([regex]'(.+)\1'|% *hes $s|sort L*)[-1]|% Gr*|% V*)[1])|% Re* $s '' 1

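Try it online! (all test cases)

Explanation

Starting from the inside, we run Matches with the (.+)\1 regex to return all match objects for the specified string. The regex matches any sequence of characters that is followed by itself.

The resulting match objects are then piped into sort to be sorted by their Length property (shortened with a wildcard). This yields an array of matches sorted by length in ascending order, so we index with [-1] to get the last element (the longest). That match's value is the whole match rather than the group, so it includes the repetition; we therefore retrieve the Group object (|% Gr*) and then its value (|% V*) to get the largest repeated string. The catch is that the group object is actually an array, because group 0 is always the whole match, while I want the actual group (1). The resulting value is therefore really a list of values, hence the indexing with [1] to get the second element. This value is cast to a regex object itself, and then the Replace method is called against the original string, replacing with nothing and replacing only the first match (|% Re* $s '' 1).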

— briantist

5

Haskell, 101 bytes

The main function is f, which takes and returns a String.

l=length
a=splitAt
f s|i<-[0..l s-1]=[p++t|n<-i,(p,(r,t))<-fmap(a$l s-n).(`a`s)<$>i,r==take(l r)t]!!0

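Try it online!

When I started, I imported Data.List and used maximum, tails, inits and isPrefixOf. Somehow that turned into this. But I still only managed to shave off 11 bytes...

Notes

splitAt/a splits a string at a given index.
s is the input string.
i is the list of numbers [0 .. length s - 1]; the -1 works around the fact that splitAt splits at the end if given too large an index.
n is length s minus the current length goal for the repeated part; it is chosen that way so we don't have to use two number lists and/or the verbose decreasing list syntax.
p, r and t are a three-way split of s, with r the intended repeated part. The fmap there uses the (,) String Functor to avoid a variable for the intermediate split.
!!0 selects the first element of the list of matches.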
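
The three-way split described in these notes can be sketched in Python. This is a loose transliteration rather than the golfed Haskell itself: dedupe_three_way is a hypothetical name, and slicing stands in for splitAt.

```python
def dedupe_three_way(s: str) -> str:
    n = len(s)
    for goal in range(n, 0, -1):  # length goal for the repeat, longest first
        for cut in range(n):      # leftmost split point first
            p, rest = s[:cut], s[cut:]
            r, t = rest[:goal], rest[goal:]
            # r is never empty because cut < n; if rest is shorter than the
            # goal, t is empty and the prefix test below simply fails.
            if t.startswith(r):
                return p + t      # corresponds to taking the first match
    raise ValueError("input contains no immediately repeated substring")


print(dedupe_three_way("ababcbc"))  # abcbc
```

Iterating the length goal in the outer loop and the split point in the inner loop is what makes the first hit both the longest and the leftmost repeat.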

— ØrjanJohansen

4

Jelly, 23 21 bytes

ṚẆUẋ€2ẇÐf¹ṪðLHḶ+w@Ṭœp

Thanks to @JonathanAllan for his Ṭœp idea, which saved 2 bytes.

Try it online!

— Dennis

4

Mathematica, 63 60 59 bytes

4 bytes saved thanks to Martin Ender.

#&@@StringReplaceList[#,a__~~a__->a]~SortBy~{StringLength}&

Anonymous function. Takes a string as input and returns a string as output.

— LegionMammal978

This doesn't seem to work on example 6: ~SortBy~StringLength sorts strings alphabetically if their lengths are the same.

— Not a tree

1

@LegionMammal978 The shorter fix is to keep SortBy and wrap StringLength in a list to get a stable sort.

— Martin Ender
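
The tie-breaking issue raised in these comments can be illustrated in Python. This is only an analogy: candidates is a hypothetical helper that lists results from left to right, and ordinary string comparison stands in for Mathematica's canonical order.

```python
def candidates(s: str) -> list:
    """Every string obtained by collapsing one doubled substring XX to X."""
    out = []
    for start in range(len(s)):
        for size in range(1, (len(s) - start) // 2 + 1):
            if s[start:start + size] == s[start + size:start + 2 * size]:
                out.append(s[:start] + s[start + size:])
    return out


cands = candidates("ababcbc")
print(cands)                                          # ['abcbc', 'ababc']
print(sorted(cands, key=len)[0])                      # abcbc (stable sort)
print(sorted(cands, key=lambda c: (len(c), c))[0])    # ababc (ties broken by content)
```

With a stable sort on length alone, the leftmost repeat wins the tie. Once the strings themselves take part in the comparison, the alphabetically smaller result wins instead, which is the wrong answer for example 6.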

3

JavaScript (ES6), 70 bytes

s=>s.replace(s.match(/(.+)(?=\1)/g).reduce((p,c)=>c[p.length]?c:p),'')

Test cases


let f =

s=>s.replace(s.match(/(.+)(?=\1)/g).reduce((p,c)=>c[p.length]?c:p),'')

console.log(f("hello hello world"))
console.log(f("foofoo"))
console.log(f("aaaaa"))
console.log(f("ababcbc"))
console.log(f("Sometimes it happens that while typing a sentence, I am distracted and I end up typing the same couple of words twice couple of words twice in succession."))


— Arnauld

Fails on aaaabaaab, but nice use of reduce.

— Neil

2

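This should be a comment, but I don't have enough reputation to comment. I just want to tell @Neil that his code can be reduced to 77 bytes. You don't need to use a lookahead assertion in the regex. Here is the reduced version: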

s=>s.replace(/(.+)\1/g,(_,m)=>(n=m.length)>l&&(l=n,r=m),l=0)&&s.replace(r,'')

— 特洛尔

2

Hello, and welcome to PPCG! You can submit this as your own JavaScript answer! If you'd like, I can edit your post and show you how it should look.

— NoOneIsHere

2

I need the lookahead assertion to deal with the case of overlapping matches. aabab is the shortest example where your suggestion fails.

— Neil
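
The overlapping-match point can be reproduced with Python's re module, used here as a stand-in for the JavaScript regex engine. The helper longest_repeat is hypothetical and only reports the longest repeated unit that each pattern finds.

```python
import re


def longest_repeat(s: str, pattern: str) -> str:
    """Return the first longest group 1 among all matches of pattern in s."""
    best = ""
    for m in re.finditer(pattern, s):
        if len(m.group(1)) > len(best):
            best = m.group(1)
    return best


# A consuming match swallows "aa" first, so the scan resumes at index 2 and
# never tries index 1, where "abab" starts.
print(longest_repeat("aabab", r"(.+)\1"))      # a
# A lookahead consumes nothing, so every start position is examined.
print(longest_repeat("aabab", r"(?=(.+)\1)"))  # ab
```

Since the lookahead matches the empty string, the engine advances one character at a time and the repeat starting at index 1 is found.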

0

C#, 169 bytes

(s)=>{var x="";for(int i=0;i<s.Length-2;i++){for(int l=1;l<=(s.Length-i)/2;l++){var y=s.Substring(i,l);if(s.Contains(y+y)&l>x.Length)x=y;}}return s.Replace(x+x,x);}

Explanation

(s) => {                // Anonymous function declaration    
    var x = "";         // String to store the longest repeating substring found
    for (int i = 0; i < s.Length - 2; i++) {               // Loop through the input string
        for (int l = 1; l <= (s.Length - i) / 2; l++) {    // Loop through all possible substring lengths
            var y = s.Substring(i, l);
            if (s.Contains(y + y) & l > x.Length) x = y;   // Check if the substring repeats and is longer than any previously found
        }
    }
    return s.Replace(x + x, x);    // Perform the replacement
}

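This is the brute-force approach: try every possible substring until we find the longest repeating one. Regex is undoubtedly more efficient, but dealing with Regex in C# tends to be quite verbose.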

— Extragorey

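Welcome to PPCG! All answers need to be either full programs or callable functions, not just snippets that take their input from hard-coded variables. Also, please show the version of the code that you actually counted, with all unnecessary whitespace removed. You can always add a more readable, indented version in addition to the fully golfed one.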

— Martin Ender

0

PHP, 84 82 bytes

Note: uses IBM-850 encoding.

for($l=strlen($argn);--$l&&!$r=preg_filter("#(.{0$l})\g-1#",~█╬,$argn,1););echo$r;

Run like this:

echo 'hello hello world' | php -nR 'for($l=strlen($argn);--$l&&!$r=preg_filter("#(.{0$l})\g-1#",~█╬,$argn,1););echo$r;';echo
> hello world

Explanation

for(
  $l=strlen($argn);   # Set $l to input length.
  --$l   &&           # Decrement $l each iteration until it becomes 0.
  !$r=preg_filter(    # Stop looping when preg_filter has a result
                      # (meaning a successful replace).
    "#(.{0$l})\g-1#", # Find any character, $l times (so the longest
                      # match is tried first), repeated twice.
    ~█╬,              # Replace with $1: first capture group, removing the
                      # duplicate.
    $argn,
    1                 # Only replace 1 match.
  );
);
echo$r;               # Print the result of the (only) successful
                      # search/replace, if any.
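
The loop above can be sketched in Python as well. This is an approximation under stated assumptions: re.subn stands in for preg_filter, the name dedupe_by_length is hypothetical, and the input is assumed to contain no linefeeds (as in the PHP version, . does not match them).

```python
import re


def dedupe_by_length(s: str) -> str:
    # Try the longest candidate length first and stop at the first length
    # for which a doubled block of exactly that many characters exists.
    for size in range(len(s) - 1, 0, -1):
        pattern = r"(.{%d})\1" % size
        result, count = re.subn(pattern, r"\1", s, count=1)
        if count:
            return result  # only the leftmost match was replaced
    return s               # no repeat found (invalid input)


print(dedupe_by_length("hello hello world"))  # hello world
```

Fixing the block length inside the pattern is what lets a plain left-to-right regex search return the longest repeat: the length is chosen by the loop, the position by the engine.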

Tweaks

Saved 2 bytes, because there is no minimum length for the repeated substring

— ros

Find the original string, without the repetition without the repetition in the middle
