How to use gsub to replace only whole-word matches



See the script below:

nawk 'NR==FNR { a[$1]=$2 ; next} {for ( i in a) gsub(i,a[i])}1' file.dat 1.txt

1.txt

ioiufeioru
dfoiduf
MO_CIF_INP438
fjkdj MO_CIF_INP
dsjhdf BP_LINKED_TETES
dehdueuh MO_INP_BPRESP

file.dat

MO_CIF_INP TRO_MO_CIF_INP

BP_LINKED_TETES TRO_BP_LINKED_TETES

MO_BPID TRO_MO_BPID

MO_INP_BPRESP TRO_MO_INP_BPRESP

The output of the script above is:

ioiufeioru
dfoiduf
TRO_MO_CIF_INP438
fjkdj TRO_MO_CIF_INP
dsjhdf TRO_BP_LINKED_TETES
dehdueuh TRO_MO_INP_BPRESP

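My intent is to replace only whole-word matches, so in the case above MO_CIF_INP438 should not be replaced. How can I search for whole words only? I tried the following, but neither attempt worked.

1.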

nawk 'NR==FNR { a[$1]=$2 ; next} {for ( i in a) gsub(/i/,a[i])}1' file.dat 1.txt


TRO_BP_LINKED_TETESoTRO_BP_LINKED_TETESufeTRO_BP_LINKED_TETESoru
dfoTRO_BP_LINKED_TETESduf
MO_CIF_INP438
fjkdj MO_CIF_INP
dsjhdf BP_LINKED_TETES
dehdueuh MO_INP_BPRESP

2.

nawk 'NR==FNR { a[$1]=$2 ; next} {for ( i in a) gsub(\<i\>,a[i])}1' file.dat 1.txt
nawk: syntax error at source line 1
 context is
        NR==FNR { a[$1]=$2 ; next} {for ( i in a) >>>  gsub(\ <<< <i\>,a[i])}1
nawk: illegal statement at source line 1

Answers:



You can try the following:

awk '
# Read all of file.dat into an array: column 1 is the index, column 2 the value.
NR==FNR {
    a[$1]=$2
    # Skip the remaining rules until file.dat has been read completely.
    next
}
# For every line of 1.txt:
{
    # For each index of the array...
    for(i in a) {
        # ...look at each field of the current line.
        for(x=1;x<=NF;x++) {
            # On an exact match, replace the field with the array value; otherwise leave it as is.
            $x=(i==$x)?a[i]:$x
        }
    }
}1' file.dat 1.txt
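
The nested loop above compares every array index with every field. Since awk arrays support direct membership tests, the same exact-match replacement can probably be written with a single loop over the fields. The following is only a sketch under some assumptions: the array name `map`, the scratch directory, and the `NF >= 2` guard (which skips the blank lines in file.dat) are illustrative choices, not part of the original answer.

```shell
#!/bin/sh
# Sample inputs copied from the question, written to a scratch directory.
tmp=$(mktemp -d)
cat > "$tmp/file.dat" <<'EOF'
MO_CIF_INP TRO_MO_CIF_INP

BP_LINKED_TETES TRO_BP_LINKED_TETES

MO_BPID TRO_MO_BPID

MO_INP_BPRESP TRO_MO_INP_BPRESP
EOF
cat > "$tmp/1.txt" <<'EOF'
ioiufeioru
dfoiduf
MO_CIF_INP438
fjkdj MO_CIF_INP
dsjhdf BP_LINKED_TETES
dehdueuh MO_INP_BPRESP
EOF

result=$(awk '
# First file: remember old -> new, ignoring blank or one-column lines.
NR == FNR { if (NF >= 2) map[$1] = $2; next }
# Second file: a field is replaced only when it equals a key exactly.
{
    for (x = 1; x <= NF; x++)
        if ($x in map) $x = map[$x]
    print
}' "$tmp/file.dat" "$tmp/1.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

One difference worth noting: assigning to a field makes awk rebuild the line with single spaces between fields. This sketch assigns only when a field is actually replaced, so lines without a match keep their original spacing.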

$ head file.dat 1.txt 
==> file.dat <==
MO_CIF_INP TRO_MO_CIF_INP

BP_LINKED_TETES TRO_BP_LINKED_TETES

MO_BPID TRO_MO_BPID

MO_INP_BPRESP TRO_MO_INP_BPRESP

==> 1.txt <==
ioiufeioru
dfoiduf
MO_CIF_INP438
fjkdj MO_CIF_INP
dsjhdf BP_LINKED_TETES
dehdueuh MO_INP_BPRESP

$ awk '              
  NR==FNR { 
      a[$1]=$2; 
      next 
  }
  {for(i in a) { 
      for(x=1;x<=NF;x++) {
          $x=(i==$x)?a[i]:$x
          }
      }
  }1' file.dat 1.txt
ioiufeioru
dfoiduf
MO_CIF_INP438
fjkdj TRO_MO_CIF_INP
dsjhdf TRO_BP_LINKED_TETES
dehdueuh TRO_MO_INP_BPRESP
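
As for why the two attempts in the question fail: in the first one, `/i/` is a regex constant that matches the literal letter "i", not the contents of the variable `i`, which is why every "i" in `ioiufeioru` was replaced. In the second one, `\<i\>` is not quoted at all, so awk reports a syntax error. (The `\<` and `\>` word-boundary operators are, as far as I know, a GNU extension and may not be available in nawk even when quoted.) The small demonstration below shows the difference between a regex constant and a dynamic regex; the value "feio" is a made-up example chosen only for illustration.

```shell
#!/bin/sh
# Compare a regex constant with a dynamic regex taken from a variable.
result=$(printf '%s\n' 'ioiufeioru' | awk '{
    i = "feio"          # hypothetical search string held in a variable
    s = $0
    t = $0
    gsub(/i/, "X", s)   # regex constant: matches the letter i
    gsub(i, "X", t)     # dynamic regex: matches the string stored in i
    print s
    print t
}')

printf '%s\n' "$result"
```

The first printed line has every "i" replaced, and the second has only the substring "feio" replaced.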

Could you explain the logic, if possible?
Timson, 2014

@Timson I have added comments to explain the logic. Hope that helps!
jaypal singh, 2014

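I have one more question. In my file, words can be separated by a space, a comma (","), or a colon (":"). For example, if my file a.txt contains a pattern such as hjdfgjf,BP_LINKED_TETES,fkdsjfj:BP_LINKED_TETES jdfhdj, it should become hjdfgjf,TRO_BP_LINKED_TETES,fkdsjfj:TRO_BP_LINKED_TETES jdfhdj, but at the moment nothing is replaced. Could you please help me again?
Timson, 2014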
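
One possible way to handle the mixed separators from the follow-up comment is to stop relying on field splitting and instead walk through each line with `match()`, pulling out one run of word characters at a time and looking it up in the array. This is a sketch, not a tested answer from the thread: it assumes a "word" is a run of letters, digits, and underscores, and the second line of the sample a.txt is a made-up test case. Everything between words is copied through unchanged, so commas, colons, and spacing are preserved.

```shell
#!/bin/sh
# Sample inputs in a scratch directory; the first a.txt line is from the comment,
# the second is a hypothetical extra case.
tmp=$(mktemp -d)
cat > "$tmp/file.dat" <<'EOF'
MO_CIF_INP TRO_MO_CIF_INP

BP_LINKED_TETES TRO_BP_LINKED_TETES
EOF
cat > "$tmp/a.txt" <<'EOF'
hjdfgjf,BP_LINKED_TETES,fkdsjfj:BP_LINKED_TETES jdfhdj
MO_CIF_INP438,MO_CIF_INP
EOF

result=$(awk '
# First file: remember old -> new, ignoring blank or one-column lines.
NR == FNR { if (NF >= 2) a[$1] = $2; next }
{
    out = ""
    rest = $0
    # Take the next run of word characters until none is left.
    while (match(rest, /[A-Za-z0-9_]+/)) {
        word = substr(rest, RSTART, RLENGTH)
        # Copy the separators before the word unchanged.
        out = out substr(rest, 1, RSTART - 1)
        # Replace the word only when it equals a key exactly.
        if (word in a)
            out = out a[word]
        else
            out = out word
        rest = substr(rest, RSTART + RLENGTH)
    }
    print out rest
}' "$tmp/file.dat" "$tmp/a.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Because each word is looked up as a whole, MO_CIF_INP438 is left alone while a standalone MO_CIF_INP is replaced, whichever separator surrounds it.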
Licensed under cc by-sa 3.0 with attribution required.