Replacing many special characters with their escaped forms in sed, using a table or script?


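If you want to replace special characters with sed, there are several ways to do it, but my problem is that I have to replace many (100+) special characters with their escaped forms, across many files.

So this is what is needed (thanks, Peter):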

^^ to escape ^
^| to escape |
\& to escape &
\/ to escape /
\\ to escape \

Suppose there are 100+ strings like the following examples, in many files:

sed.exe -i "s/{\*)(//123/
sed -i "s/\\/123/g;" 1.txt
sed.exe -i "s/{\*)(//123/
sed -i "s/\\/123/g;" 1.txt
.....
.....

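These strings contain many special characters that need escaping (we have 100+ strings).
Escaping them by hand is a very long job, so I need a table script, something like wReplace, that can be called from the command prompt to escape the special characters and then replace the strings with my words.
How can I do that?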

Answers:


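Note that ^^ for ^, ^| for |, ^& for & and so on are not requirements of sed. The ^ escape character is required by the CMD shell. If your text is exposed neither to the command line nor to command arguments in a .cmd / .bat command script, you only need to consider sed's escape character, which is a backslash \ . These are two quite separate scopes (they can overlap), so it is usually better to keep everything within sed's scope, as follows.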
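To make the sed side of that concrete, here is a minimal sketch, assuming a POSIX shell and GNU sed. The helper names escape_find and escape_repl are hypothetical and are not part of the script given further down; they only illustrate the kind of backslash-escaping sed needs for a literal find-string and a literal replacement-string.

```shell
# Hypothetical helpers (not from the original answer); assumes a POSIX shell and GNU sed.
# A comma is used as the s-command delimiter so that "/" and "\" inside the
# bracket expression are unambiguously literal.

# Escape a literal string for the FIND side of s/// (basic regular expression).
escape_find() {
    printf '%s\n' "$1" | sed 's,[]\/$*.^[],\\&,g'
}

# Escape a literal string for the REPLACEMENT side of s/// (only \ / & are special).
escape_repl() {
    printf '%s\n' "$1" | sed 's,[\/&],\\&,g'
}

find_re=$(escape_find 'a.b*c/d')
repl_str=$(escape_repl 'x&y/z')

# Only the exact text a.b*c/d is replaced; aXbbc/d is left alone,
# although the unescaped pattern a.b*c/d would have matched it too.
printf '%s\n' 'a.b*c/d and aXbbc/d' | sed "s/$find_re/$repl_str/g"
```

Because the strings are held in shell variables or files rather than typed on a CMD command line, no ^ escaping is involved at all.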

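Here is a sed script that will replace any number of find-strings you specify with their corresponding replacement-strings. The general format of the strings is a cross between a sed substitution command (s/abc/xyz/p) and a tabular format. You can "stretch" the middle delimiter so that the columns line up.
You can use a FIXED-string pattern (F/...) or a normal sed-style regular expression (s/...), and you can tune the output of sed -n with a /p on each line (in table.txt) as needed.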

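A minimal run needs 3 files (plus a 4th, derived dynamically from table.txt):

  1. The main script: table-to-regex.sed
  2. The table file: table.txt
  3. The target file: file-to-change.txt
  4. The derived script: table-derived.sed

To run one table against one target file: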

sed -nf table-to-regex.sed  table.txt > table-derived.sed
# Here, check `table-derived.sed` for errors as described in the example *table.txt*.  

sed -nf table-derived.sed  file-to-change.txt
# Redirect *sed's* output via `>` or `>>` as need be, or use `sed -i -nf` 

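If you want to run table.txt against many files, just put the snippet above into a simple loop that suits your requirements. I could do this trivially in bash, but someone who knows the Windows CMD shell better than I do is better placed to set that up.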
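For what it is worth, such a loop might look like the following in a POSIX shell. This is only a sketch under assumptions: apply_derived is a hypothetical helper name, GNU sed is assumed for the -i option, and the table lines are assumed to carry no p flag (otherwise matching lines would be printed twice once -n is dropped).

```shell
# apply_derived is a hypothetical helper name (not from the original answer).
# Usage: apply_derived DERIVED_SCRIPT FILE...
# Assumes GNU sed, whose -i option edits each file in place.
apply_derived() {
    script=$1
    shift
    for f in "$@"; do
        # No -n here: with -n, only lines carrying a p flag would be written
        # back, so every non-matching line would be lost from the file.
        sed -i -f "$script" "$f"
    done
}

# Example call (hypothetical file names):
#   sed -nf table-to-regex.sed table.txt > table-derived.sed
#   apply_derived table-derived.sed 1.txt 2.txt 3.txt
```

The derived script is generated once and then reused for every file, so the table only has to be checked for errors a single time.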


Here is the script: table-to-regex.sed

s/[[:space:]]*$//  # remove trailing whitespace

/^$\|^[[:space:]]*#/{p; b}  # empty and sed-style comment lines: print and branch
                            # printing keeps line numbers; for referencing errors

/^\([Fs]\)\(.\)\(.*\2\)\{4\}/{  # too many delims ERROR
      s/^/# error + # /p        # print a flagged/commented error
      b }                       # branch

/^\([Fs]\)\(.\)\(.*\2\)\{3\}/{                  # this may be a long-form 2nd delimiter
   /^\([Fs]\)\(.\)\(.*\2[[:space:]]*\2.*\2\)/{  # is long-form 2nd delimiter OK?
      s/^\([Fs]\)\(.\)\(.*\)\2[[:space:]]*\2\(.*\)\2\(.*\)/\1\2\n\3\n\4\n\5/
      t OK                                      # branch on true to :OK
   }; s/^/# error L # /p                        # print a flagged/commented error
      b }                                       # branch: long-form 2nd delimiter ERROR

/^\([Fs]\)\(.\)\(.*\2\)\{2\}/{     # this may be short-form delimiters
   /^\([Fs]\)\(.\)\(.*\2.*\2\)/{   # is short-form delimiters OK?
      s/^\([Fs]\)\(.\)\(.*\)\2\(.*\)\2\(.*\)/\1\2\n\3\n\4\n\5/
      t OK                         # branch on true to :OK  
   }; s/^/# error S # /p           # print a flagged/commented error
      b }                          # branch: short-form delimiters ERROR

{ s/^/# error - # /p        # print a flagged/commented error
  b }                       # branch: too few delimiters ERROR

:OK     # delimiters are okay
#============================
h   # copy the pattern-space to the hold space

# NOTE: /^s/ lines are considered to contain regex patterns, not FIXED strings.
/^s/{    s/^s\(.\)\n/s\1/   # shrink long-form delimiter to short-form
     :s; s/^s\(.\)\([^\n]*\)\n/s\1\2\1/; t s  # branch on true to :s 
      p; b }                                  # print and branch

# The following code handles FIXED-string /^F/ lines

s/^F.\n\([^\n]*\)\n.*/\1/  # isolate the literal find-string in the pattern-space
s/[]\/$*.^[]/\\&/g         # convert the literal find-string into a regex of itself
                           # ('|' is left unescaped: '\|' means alternation in GNU sed)
H                          # append \n + find-regex to the hold-space

g   # Copy the modified hold-space back into the pattern-space

s/^F.\n[^\n]*\n\([^\n]*\)\n.*/\1/  # isolate the literal repl-string in the pattern-space
s/[\/&]/\\&/g                      # convert the literal repl-string into a regex of itself
H                                  # append \n + repl-regex to the hold-space

g   # Copy the modified hold-space back into the pattern-space

# Rearrange pattern-space into a / delimited command: s/find/repl/...      
s/^\(F.\)\n\([^\n]*\)\n\([^\n]*\)\n\([^\n]*\)\n\([^\n]*\)\n\([^\n]*\)$/s\/\5\/\6\/\4/

p   # Print the modified find-and-replace regular expression line

Here is a sample table file, which includes an explanation of how it works: table.txt

# The script expects an input table file, which can contain 
#   comment, blank, and substitution lines. The text you are
#   now reading is part of an input table file.

# Comment lines begin with optional whitespace followed by #

# Each substitution line must start with: 's' or 'F'
#  's' lines are treated as normal `sed` substitution regular expressions
#  'F' lines are considered to contain `FIXED` (literal) string expressions 
# The 's' or 'F' must be followed by the 1st of 3 delimiters   
#   which must not appear elsewhere on the same line.
# A pre-test is performed to ensure conformity. Lines with 
#   too many or too few delimiters, or no 's' or 'F', are flagged   
#   with the text '# error ? #', which effectively comments them out.
#   '?' can be: '-' too few, '+' too many, 'L' long-form, 'S' short-form
#   Here is an example of a long-form error, as it appears in the output. 

# error L # s/example/(7+3)/2=5/

# 1st delimiter, eg '/' must be a single character.
# 2nd (middle) delimiter has two possible forms:
#   Either it is exactly the same as the 1st delimiter: '/' (short-form)
#   or it has a double-form for column alignment: '/      /' (long-form)
#   The long-form can have any amount of whitespace between the 2 '/'s   
# 3rd delimiter must be the same as the 1st delimiter,

# After the 3rd delimiter, you can put any of sed's 
#    substitution commands, eg. 'g'

# With one condition, a trailing '#' comment to 's' and 'F' lines is
#    valid. The condition is that no delimiter character can be in the 
#    comment (delimiters must not appear elsewhere on the same line)

# For 's' type lines, it is implied that *you* have included all the 
#    necessary sed-escape characters!  The script does not add any 
#    sed-escape characters for 's' type lines. It will, however, 
#    convert a long-form middle-delimiter into a short-form delimiter.   

# For 'F' type lines, it is implied that both strings (find and replace) 
#    are FIXED/literal-strings. The script does add the  necessary 
#    sed-escape characters for 'F' type lines. It will also 
#    convert a long-form middle-delimiter into a short-form delimiter.   

# The result is a sed-script which contains one sed-substitution 
#    statement per line; it is just a modified version of your 
#    's' and 'F' strings "table" file.

# Note that the 1st delimiter is *always* in column 2.

# Here are some sample 's' and 'F' lines, with comments:
#

F/abc/ABC/gp               #-> These 3 are the same for 's' and 'F', 
s/abc/ABC/gp               #-> as no characters need to be escaped,  
s/abc/         /ABC/gp     #-> and the 2nd delimiter shrinks to one  

F/^F=Fixed/    /\1okay/p   # \1 is okay here, It is a FIXED literal
s|^s=sed regex||\1FAIL|p   # \1 will FAIL: back-reference not defined!

F|\\\\|////|               # this line == next line 
F|\\\\|        |////|p     # this line == previous line  
s|\\\\|        |////|p     # this line is different; 's' vs 'F'

F_Hello! ^.&`//\\*$/['{'$";"`_    _Ciao!_   # literal find / replace    

Here is a sample input file whose text you want to change: file-to-change.txt

abc abc
^F=Fixed
   s=sed regex
\\\\ \\\\ \\\\ \\\\
Hello! ^.&`//\\*$/['{'$";"`
some non-matching text

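I'm trying to understand. I copied your text and pasted it into 3 files: table-to-regex.sed, table.txt and table-derived.sed. But when I try to run the second command, sed -nf table-derived.sed file-to-change.text, CMD gives me this error: img152.imageshack.us/img152/30/76715330.png. I hope I did the right thing by copying the text verbatim into the 3 files.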
user143822

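Yes, that is exactly what should happen. That particular line produces an error on purpose, to show you what not to do; read the comment on that line. Every line in table.txt has a comment describing what it does. This is line 58 of table.txt: s|^s=sed regex||\1FAIL| # \1 will FAIL: back-reference not defined! . The || on that line is turned into | in table-derived.sed, so it becomes: s|^s=sed regex|\1FAIL| # \1 will FAIL: back-reference not defined!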
Peter.O

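Continued... That particular line is an s type, not an F type, and as noted in the main script table-to-regex.sed: # NOTE: /^s/ lines are considered to contain regex patterns, not FIXED strings. If you supply a bad regex, it will fail. The error occurs because sed sees the \1 as a back-reference, but no such back-reference is defined in the search pattern. Short of writing a full regex parser, there is no way to catch that error. This type of error should not happen with F type lines, because they are treated as FIXED (not regex).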
Peter.O
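The difference between the two line types can be reproduced directly. The following is an illustration only, assuming a POSIX shell and GNU sed; the second command is what the F line from table.txt should amount to once its find- and replace-strings have been escaped.

```shell
# Illustration only; assumes a POSIX shell and GNU sed.
# Derived from the 's' line: \1 is a back-reference, but the pattern has no
# \( \) group, so sed refuses the script and exits with a non-zero status.
if printf '%s\n' '   s=sed regex' | sed 's|^s=sed regex|\1FAIL|' 2>/dev/null; then
    s_status=accepted
else
    s_status=rejected
fi
printf '%s\n' "s line: $s_status"

# Escaped form of the 'F' line: ^ and \ are escaped, so \1 is plain text.
f_result=$(printf '%s\n' '^F=Fixed' | sed 's/\^F=Fixed/\\1okay/')
printf '%s\n' "F line: $f_result"
```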

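StackExchange sites have a special chat facility for extended discussions like this one. Unfortunately I can't get mine to work, but if you want to discuss this further and you have an IRC client (mIRC is the most common one on Windows), we can chat on irc.freenode.com (you can type /connect irc.freenode.com on mIRC's command line, then, once the connection is established, type the following command: /join #su447178 ). I'm there now... ci vediamo.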
Peter.O

Thank you very much. I removed the p from table.txt and the -n from the second command.
user143822
Licensed under cc by-sa 3.0 with attribution required.