Replace characters except for the last x occurrences

9

I have a file with a bunch of hostnames paired with IPs that looks like this:

x-cluster-front-1 192.168.1.2
x-cluster-front-2 192.158.1.10
y-cluster-back-1 10.1.11.99
y-cluster-back-2 10.1.157.38
int.test.example.com 59.2.86.3
super.awesome.machine 123.234.15.6

I want it to look like this:

x-cluster-front-1 192.168.1.2
x-cluster-front-2 192.158.1.10
y-cluster-back-1 10.1.11.99
y-cluster-back-2 10.1.157.38
int-test-example-com 59.2.86.3
super-awesome-machine 123.234.15.6

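How can I replace the . (dots) in the first column with - (hyphens), to make sorting by the second column easier? I was thinking of using sed to replace dots up to the first space, or to replace all dots except the last three, but I'm having trouble understanding regular expressions and sed. I can do simple substitutions, but this is over my head!

This is part of a larger script that I've been writing in bash. I'm stuck on this part.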

text-processing sed regular-expression

— Florin

7

You can use AWK:

awk '{gsub(/\./,"-",$1);print}' infile

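Explanation

awk splits a line on whitespace by default. The first column of the line ($1 in awk-ese) is therefore the one you want to run the substitution on. For this purpose, you can use: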

 gsub(regex,replacement,string)

to carry out the required substitution.

Note that gsub is supported only by gawk and nawk, but on many modern distros awk is a symlink to gawk.
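
As a rough illustration of the explanation above, the following sketch runs the gsub call on two of the sample lines from the question. The function name `dots_to_hyphens` is made up for this example, and it assumes an awk that provides gsub.

```shell
#!/bin/sh
# Hypothetical helper (the name is illustrative): rewrite the dots in the
# first whitespace-separated field as hyphens and print the line.
dots_to_hyphens() {
    awk '{ gsub(/\./, "-", $1); print }'
}

# Two sample lines taken from the question.
printf '%s\n' 'int.test.example.com 59.2.86.3' 'x-cluster-front-1 192.168.1.2' |
    dots_to_hyphens
```

The dot is escaped in the regex because an unescaped . would match any character and turn the whole first field into hyphens.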

— Rahul Patil

1

+1 for beating me to it. I think an explanation would really benefit the asker and future readers.

— Joseph R.

1

@JosephR. Sorry, I'm not good at explaining, but I tried and updated..

— Rahul Patil

2

The POSIX specification of awk is based on nawk, so all modern awk implementations should have gsub. On Solaris, you may need /usr/xpg4/bin/awk or nawk.

— Stéphane Chazelas

@RahulPatil If you don't mind, I added a few lines that might help others.

— Joseph R.

@JosephR Thanks.., now it looks perfect.. :)

— Rahul Patil

6

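If you need to do the substitution on the first field, Rahul's awk solution is the best choice, but note that it may affect the spacing (the fields are rewritten separated by a single space).

You can avoid that by writing it as: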

perl -pe 's|\S+|$&=~tr/./-/r|e' file

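The -p flag means "read the input file line by line and print each line after applying the script given by -e". The script then substitutes (s|pattern|replacement|) the first sequence of non-space characters (\S+) with the matched text ($&) after every . in it has been replaced by -. The trick is the e flag in s|||e, which makes Perl evaluate the replacement as an expression. That lets you apply a second substitution (tr/./-/) to the match ($&) of the outer one (s|||e). The r flag makes tr return the modified copy instead of changing $& in place.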

If you need to replace every . with - except the last 3, then with GNU sed and assuming you have a rev command:

rev file | sed 's/\./-/4g' | rev
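
Why the reversal helps: in GNU sed, combining a number with the g flag replaces that match and every later one, so on the reversed line the first three dots (the last three of the original) are skipped. A minimal sketch, assuming GNU sed and the rev utility are installed; the function name is illustrative only:

```shell
#!/bin/sh
# Hypothetical helper: replace every "." except the last three on each line.
# Relies on the GNU sed extension of combining a number with the g flag.
keep_last_three_dots() {
    rev | sed 's/\./-/4g' | rev
}

# Sample line from the question: six dots, the last three belong to the IP.
printf '%s\n' 'int.test.example.com 59.2.86.3' | keep_last_three_dots
```

This only gives the wanted result when the part to protect always contains exactly three dots, as an IPv4 address does.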

— Stéphane Chazelas

1

Note that the Perl solution assumes version 5.14 or later (for /r to work).

— Joseph R.

3

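Sed isn't the simplest tool for this (see the other answers for better ones), but it can be done.

To replace . by - only up to the first space, use s in a loop.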

sed -e '
  : a                     # Label "a" for the branching command
  s/^\([^ .]*\)\./\1-/    # If there is a "." before the first space, replace it by "-"
  t a                     # If the s command matched, branch to a
'

(Note that some sed implementations do not support comments on the same line as a command. GNU sed does.)
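
For a sed that rejects trailing comments, the same loop can be written without them by passing each command as its own -e expression. This is a sketch of that variant rather than part of the original answer, and the function name is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: comment-free form of the loop, one command per -e.
#   :a   defines the label "a"
#   s    replaces one "." that appears before the first space
#   ta   jumps back to "a" if the s command made a replacement
dots_before_first_space() {
    sed -e ':a' -e 's/^\([^ .]*\)\./\1-/' -e 'ta'
}

printf '%s\n' 'super.awesome.machine 123.234.15.6' | dots_before_first_space
```

The loop ends on its own: once the text before the first space has no dot left, the s command no longer matches and t does not branch.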

To perform the substitution up to the last space instead:

sed -e '
  : a                     # Label "a" for the branching command
  s/\.\(.* \)/-\1/        # If there is a "." before the last space, replace it by "-"
  t a                     # If the s command matched, branch to a
'

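Another technique is to use sed's hold space. Save the bit you don't want to modify in the hold space, do the work, then recall the hold space. Here, I split the line at the last space and replace dots with dashes in the first part.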

sed -e '
  h           # Save the current line to the hold space
  s/.* //     # Remove everything up to and including the last space
  x           # Swap the work space with the hold space
  s/[^ ]*$//  # Remove everything after the last space
  y/./-/      # Replace all "." by "-"
  G           # Append the content of the hold to the work space
  s/\n//      # Remove the newline introduced by G
'

— Gilles

2

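Now that Rahul has given you the canonical answer for your use case, I thought I'd take a stab at answering the titular question: replacing all but the last x occurrences of a regex: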

perl -pe '
    $count = tr{.}{.};     # Count the dots on the current line
    $x = 3;                # Number of trailing dots to leave alone
    next LINE if $count <= $x;
    while (s{\.}{-}) {     # Substitute one dot with a hyphen
        last if ++$i == $count - $x;  # Quit the loop before the last x dots
    }
    $i = 0;                # Reset the counter for the next line
' your_file

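The code above (tested) does not assume that you have space-separated fields. It replaces all dots on a line with dashes, except the last three. Change the 3 in the code to taste.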
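
The same idea (count the dots, then convert only the first count minus x of them) can also be sketched in plain POSIX shell. This is an illustrative alternative, not the answer's own method; the function name and its two arguments are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper: replace all but the last $2 dots in the string $1.
# Note: POSIX sh has no local variables, so s, keep, dots, n and out are global.
replace_all_but_last() {
    s=$1 keep=$2
    # Count the dots by deleting every character that is not a dot.
    dots=$(printf '%s' "$s" | tr -cd '.')
    n=$(( ${#dots} - keep ))
    out=
    # Move one dot-terminated chunk at a time from s to out, swapping the dot.
    while [ "$n" -gt 0 ]; do
        out="$out${s%%.*}-"   # text before the first remaining dot, plus "-"
        s=${s#*.}             # drop that text and the dot from s
        n=$(( n - 1 ))
    done
    printf '%s\n' "$out$s"
}

replace_all_but_last 'int.test.example.com 59.2.86.3' 3
```

A line with x dots or fewer is printed unchanged, which mirrors the `next LINE` guard in the Perl version.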

— Joseph R.

2

You can use many different tools for this. Rahul Patil has already given you the gawk one; here are a few others:

Perl
```
perl -lane  '$F[0]=~s/\./-/g; print "@F"' file
```
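The -a switch makes perl automatically split each input line on whitespace and save the resulting fields in the array @F. The first field is therefore $F[0], so we replace (s///) all occurrences of . with - in the first field and then print the whole array.
Shell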
```
 while read -r a b; do printf "%s %s\n" "${a//./-}" "$b"; done < file 
```
Here, the while loop reads the file and read splits each line on whitespace into two fields, $a and $b. The construct ${a//pattern/replacement} replaces all occurrences of pattern with replacement.
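
A short sketch of how the single-slash and double-slash forms of this expansion differ. It assumes bash (ksh and zsh behave the same way for this case), since the construct is not part of POSIX sh; the variable names are chosen for the example.

```shell
#!/usr/bin/env bash
# Requires bash (or ksh/zsh): ${var/pat/rep} is not available in POSIX sh.
host='int.test.example.com'   # sample hostname from the question

first_only=${host/./-}    # single slash: replace the first match only
all=${host//./-}          # double slash: replace every match

printf '%s\n%s\n' "$first_only" "$all"
```

The pattern here is a shell glob, not a regex, so the dot matches a literal dot and needs no escaping.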

— Terdon

+1 Although perlrun(1) will tell you that -a is "autosplit mode", I prefer to think of it as "awk mode" :D

— Joseph R.

2

I believe this is easier to read than one big nasty regex. Basically, I just split the line on whitespace into two fields and use sed on the first part.

while read -r host ip; do
    echo "$(sed 's/\./-/g' <<< "$host") $ip"
done < input_file

Depending on your shell, you could also use ${host//./-} instead of the sed command.

— Maddox

0

sed 's/\./-/' <file name>

You can do this without the g at the end of the command; it will then replace only the first occurrence of the pattern on each line.

— Sunandan