10

How can I use the sed command to replace the third occurrence of a string in a file?

Example:

Change only the third occurrence of is to us in the file.

My input file contains:

hai this is linux.
hai this is unix.
hai this is mac.
hai this is unchanged.

My expected output is:

hai this is linux.
hai thus is unix.
hai this is mac.
hai this is unchanged.

text-processing sed perl

— Sureshkumar

3

The input and output are identical.

— Hauke Laging, 2015

4

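sed is not the right tool for this job.

— choroba, 2015

@don_crissti I fixed it. The OP hadn't used the formatting tools (by the way, Sureshkumar, see here for help with editing your questions), and a later editor misunderstood what was wanted.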

— terdon

11

This is much easier with perl.

To change the 3rd occurrence:

perl -pe 's{is}{++$n == 3 ? "us" : $&}ge'

To change every 3rd occurrence:

perl -pe 's{is}{++$n % 3 ? $& : "us"}ge'

— Stéphane Chazelas

3

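If the string to replace occurs only once per line, you can combine different utilities.
When the input is in the file "input" and you want to replace "is" with "us", you can use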

LINENR=$(cat input | grep -n " is " | head -3 | tail -1 | cut -d: -f1)
cat input | sed ${LINENR}' s/ is / us /'
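As a sketch of what this line-number approach produces on the question's sample input, the snippet below writes the sample to a temporary file (the mktemp file and the variable names are assumptions) and reads it directly with grep and sed.

```shell
tmp=$(mktemp)
printf '%s\n' 'hai this is linux.' 'hai this is unix.' \
              'hai this is mac.' 'hai this is unchanged.' > "$tmp"

# Line number of the third line that contains " is " (with surrounding spaces).
linenr=$(grep -n ' is ' "$tmp" | head -n 3 | tail -n 1 | cut -d: -f1)

# Substitute only on that line.
by_line=$(sed "${linenr}s/ is / us /" "$tmp")
printf '%s\n' "$by_line"
rm -f "$tmp"
```

Here linenr is 3, so the third line becomes hai this us mac. Note that if fewer than three lines match, head and tail still return the last matching line, so a real script would want to check the match count first.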

— Walter A

In the example in the question, there is more than one is on each line.

— terdon

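I thought you were looking for "is" with spaces around it. I could edit my answer to use the tr command as @jimmij did, but my solution would still be inferior to his.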

— Walter A

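I'm not the asker :). I thought the same thing, which is why I upvoted your answer, but if you look at the original version of the question (click the "edited X minutes ago" link), you'll see that the OP expected this to become thus. By the way, there's no need for cat there.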

— terdon

2

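The script below (using GNU sed syntax) is usable for in-place editing but not for output, because it stops printing lines after the desired substitution: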

sed -i '/is/{: 1 ; /\(.*is\)\{3\}/!{N;b1} ; s/is/us/3 ; q}' text.file

If you agree with choroba's point, you can modify the above:

sed '/is/{:1 ; /\(.*is\)\{3\}/!{N;b1} ; s/is/us/3 ; :2 ; n ; $!b2}' text.file

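This version outputs all lines.

Or you have to put all lines into the pattern space (in memory, so mind the size limits) and do the substitution there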

sed ': 1 ; N ; $!b1 ; s/is/us/3 ' text.file
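A minimal sketch of this last, whole-file variant on the question's sample input. It is written with one -e per command, on the assumption that this is the safer spelling for sed implementations that do not accept a ; after a label; the label a and the variable names are illustrative.

```shell
sample='hai this is linux.
hai this is unix.
hai this is mac.
hai this is unchanged.'

# :a / N / $!ba appends every line to the pattern space,
# then the numeric flag 3 picks the third match in the whole buffer.
slurped=$(printf '%s\n' "$sample" | sed -e ':a' -e 'N' -e '$!ba' -e 's/is/us/3')
printf '%s\n' "$slurped"
```

Because the whole input is one buffer, the count runs across line boundaries and the second line comes out as hai thus is unix.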

— Costas

2

sed can be used for this if the newlines are first replaced with some other character, e.g.:

tr '\n' '\000' | sed 's/is/us/3' | tr '\000' '\n'

The same with pure (GNU) sed:

sed ':a;N;$!ba;s/\n/\x0/g;s/is/us/3;s/\x0/\n/g'

(sed newline replacement stolen from https://stackoverflow.com/a/1252191/4488514)

— jimmij

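If you're going to use GNU-specific sed syntax, you might as well use sed -z 's/is/us/3'.

— Stéphane Chazelas

@StéphaneChazelas -z must be some very new feature; my GNU sed version 4.2.1 knows nothing about this option.

— jimmij, 2015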

1

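Added in 4.2.2 (2012). In your second solution, you don't need the conversion to \x0 step.

— Stéphane Chazelas

Sorry about the edit. I hadn't seen the original version of the question; someone had misunderstood it and edited the wrong line. I've reverted to the previous version.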

— terdon

1

p='[:punct:]' s='[:space:]'
sed -Ee'1!{/\n/!b' -e\}            \
     -e's/(\n*)(.*)/ \2 \1/'       \
     -e"s/is[$p]?[$s]/\n&/g"       \
     -e"s/([^$s])\n/\1/g;1G"       \
-e:c -e"s/\ni(.* )\n{3}/u\1/"      \
     -e"/\n$/!s/\n//g;/\ni/G"      \
     -e's//i/;//tc'                \
     -e's/^ (.*) /\1/;P;$d;N;D'

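That bit of sed just carries a tally of is occurrences from one line to the next. It should reliably handle as many ises per line as you throw at it, and it doesn't need to buffer old lines to do so - it just retains a single newline character for every is it encounters that is not part of another word.

The result is that it will modify only the third occurrence in a file - and it carries the count from line to line. So if a file looks like: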

1. is is isis
2. is does

...它将打印...

1. is is isis
2. us does

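It first handles edge cases by inserting a space at the head and tail of every line. This makes word boundaries a little easier to ascertain.

It next looks for valid ises by inserting a \n newline before every occurrence of is that is immediately followed by zero or one punctuation characters and then a space. It does another pass and removes all \n newlines that are immediately preceded by a non-space character. The markers left behind will match is. and is but not this or ?is.

It next gathers each marker to the tail of the string - for every \ni match on a line it appends a \n newline to the tail of the string and replaces the marker with either i or u. If there are 3 \n newlines in a row gathered at the tail of the string then it uses the u - else the i. The first time a u is used is also the last - the replacement sets off an infinite loop which boils down to get line, print line, get line, print line, and so on.

At the end of each t (test) loop cycle it cleans up the inserted spaces, prints only up to the first occurring newline in pattern space, and goes again.

I'll add an l (look) command at the head of the loop like: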

l; s/\ni(.* )\n{9}/u\1/...

...and see what it does as it works on this input:

hai this is linux.
hai this is unix.


hai this is mac.
hai this is unchanged is.

...so here's what it does:

 hai this \nis linux. \n$        #behind the scenes
hai this is linux.               #actually printed
 hai this \nis unix. \n\n$       #it builds the marker string
hai this is unix.
  \n\n\n$                        #only for lines matching the

  \n\n\n$                        #pattern - and not otherwise.

 hai this \nis mac. \n\n\n$      #here's the match - 3 ises so far in file.
hai this us mac.                 #printed
hai this is unchanged is.        #no look here - this line is never evaled

It might make more sense with more ises per line:

nthword()(  p='[:punct:]' s='[:space:]'         
    sed -e '1!{/\n/!b' -e\}             \
        -e 's/\(\n*\)\(.*\)/ \2 \1/'    \
        -e "s/$1[$p]\{0,1\}[$s]/\n&/g"  \
        -e "s/\([^$s]\)\n/\1/g;1G;:c"   \
        -e "${dbg+l;}s/\n$1\(.* \)\n\{$3\}/$2\1/" \
        -e '/\n$/!s/\n//g;/\n'"$1/G"    \
        -e "s//$1/;//tc" -e 's/^ \(.*\) /\1/'     \
        -e 'P;$d;N;D'
)

That's practically the same thing, but written with POSIX BRE and rudimentary argument handling.

 printf 'is is. is? this is%.0s\n' {1..4}  | nthword is us 12

...gets...

is is. is? this is
is is. is? this is
is is. is? this us
is is. is? this is

...and if I enable ${dbg}:

printf 'is is. is? this is%.0s\n' {1..4}  | 
dbg=1 nthword is us 12

...we can watch it iterate...

 \nis \nis. \nis? this \nis \n$
 is \nis. \nis? this \nis \n\n$
 is is. \nis? this \nis \n\n\n$
 is is. is? this \nis \n\n\n\n$
is is. is? this is
 \nis \nis. \nis? this \nis \n\n\n\n\n$
 is \nis. \nis? this \nis \n\n\n\n\n\n$
 is is. \nis? this \nis \n\n\n\n\n\n\n$
 is is. is? this \nis \n\n\n\n\n\n\n\n$
is is. is? this is
 \nis \nis. \nis? this \nis \n\n\n\n\n\n\n\n\n$
 is \nis. \nis? this \nis \n\n\n\n\n\n\n\n\n\n$
 is is. \nis? this \nis \n\n\n\n\n\n\n\n\n\n\n$
 is is. is? this \nis \n\n\n\n\n\n\n\n\n\n\n\n$
is is. is? this us
is is. is? this is
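The same tallying idea (count whole-word matches across lines, replace only the Nth) can be sketched more readably by swapping sed for awk. This is not the answer's method: nth_word is a hypothetical helper, it splits on whitespace, and it treats any single trailing non-alphanumeric character as punctuation, which is an assumption that only approximates the [:punct:] test above.

```shell
# nth_word WORD REPLACEMENT N [FILE...]   (hypothetical helper for illustration)
nth_word() {
    word=$1 repl=$2 nth=$3
    shift 3
    awk -v w="$word" -v r="$repl" -v n="$nth" '
    {
        for (i = 1; i <= NF; i++) {
            rest = substr($i, length(w) + 1)
            # Count a field only if it is WORD plus at most one trailing
            # non-alphanumeric character, so "is." counts but "isis" does not.
            if (substr($i, 1, length(w)) == w && rest ~ /^[^A-Za-z0-9]?$/ && ++count == n)
                $i = r rest    # assigning a field rebuilds the line with single spaces
        }
        print
    }' "$@"
}

printf '1. is is isis\n2. is does\n' | nth_word is us 3
```

On that input it prints the same two lines as the sed version: the first line unchanged and 2. us does. One design difference is that the modified line has its runs of whitespace collapsed to single spaces, which the sed script avoids.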

— mikeserv

Do you realize your example says "isis"?

— flarn2006 '16

@flarn2006 - I'm pretty sure it says is.

— mikeserv '16

0

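This is a logical solution that uses sed and tr, but it must be written as a script to work. The code below replaces every 3rd occurrence of the word specified in the sed command. Replace i=3 with i=n (and set n to match) to make this work for any n.

Code: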

# replace new lines with '^' character to get everything onto a single line
tr '\n' '^' < input.txt > output.txt

# count number of occurrences of the word to be replaced
num=`grep -o "apple" "output.txt" | wc -l`

# in successive iterations, replace the i + (n-1)th occurrence
n=3
i=3
while [ $i -le $num ]
do
    sed -i '' "s/apple/lemon/${i}" 'output.txt'
    i=$(( i + (n-1) ))
done

# replace the '^' back to new line character
tr '^' '\n' < output.txt > tmp && mv tmp output.txt

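How it works:

Suppose the text file is a b b b b a c a d a b b b a b e b z b s b a b.

When n = 2, we want to replace every second occurrence of b.
- a b b b b a c a d a b b b a b e b z b s b a b
  . . ^ . ^ . . . . . . ^ . . ^ . . . ^ . . . ^
- First we replace the 2nd occurrence, then the 3rd, then the 4th, 5th, and so on. Count along the sequence shown above to see for yourself.
When n = 3, we want to replace every third occurrence of b.
- a b b b b a c a d a b b b a b e b z b s b a b
  . . . ^ . . . . . . . ^ . . . . ^ . . . . . ^
- First we replace the 3rd occurrence, then the 5th, then the 7th, 9th, 11th, and so on.
When n = 4, we want to replace every fourth occurrence of b.
- First we replace the 4th occurrence, then the 7th, then the 10th, 13th, and so on.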
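The n - 1 step can be checked on the example string itself. The sketch below runs the same loop on a shell variable instead of a file; the replacement X is an arbitrary placeholder chosen for illustration. It relies on the replacement not containing the search string, otherwise the replaced occurrences would still be counted.

```shell
line='a b b b b a c a d a b b b a b e b z b s b a b'
n=3

# Number of occurrences before any replacement (arithmetic strips wc padding).
num=$(( $(printf '%s\n' "$line" | grep -o 'b' | wc -l) ))

i=$n
while [ "$i" -le "$num" ]; do
    # Replace the i-th remaining occurrence. Each earlier replacement removed
    # one "b", so the next target sits n-1 places further on, not n.
    line=$(printf '%s\n' "$line" | sed "s/b/X/$i")
    i=$(( i + n - 1 ))
done
printf '%s\n' "$line"
```

With n=3 this prints a b b X b a c a d a b X b a b e X z b s b a X, which matches the ^ markers shown for n = 3. The final pass asks for an occurrence that no longer exists, and sed simply leaves the line unchanged.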

— 阿格杜鲁夫