How to get the rows where the nth column contains the mth column



I have a CSV file containing domains and webmail hosts, like this:

site1.com,mail.site1.com
site2.com,testmail.com
site3.com,mx.site3.com
site4.com,smtp.site4.com
site5.com,foomail.com
site6.com,barmail.com
site7.com,webmail.site7.com
site8.com,01mx.site8.com
site9.com,foobarmail.com
site10.com,mx-smtp222.site10.com

I want to get the rows where the webmail column contains the domain column of the same row. For the example above, the output should be:

site1.com,mail.site1.com
site3.com,mx.site3.com
site4.com,smtp.site4.com
site7.com,webmail.site7.com
site8.com,01mx.site8.com
site10.com,mx-smtp222.site10.com

Answers:



Using awk:

awk -F, '$2 ~ $1"$"' file.csv
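  • -F, sets the field separator to ,

  • $2 ~ $1"$" tests whether the second field matches the regular expression formed by the first field followed by $, that is, whether the second field ends with the first field; if so, the record is printed (the default action)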
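Because $1 is used as a regular expression in the command above, a . in the domain matches any character. A minimal sketch of a literal-string variant follows, assuming that strict suffix comparison is what is wanted. The function name suffix_rows and the extra input line containing site1xcom are made up for illustration and are not part of the original answer.

```shell
#!/bin/sh
# Hypothetical helper: print rows whose 2nd field ends with the 1st field,
# comparing plain strings instead of treating $1 as a regular expression.
suffix_rows() {
    awk -F, '{
        n = length($1); m = length($2)
        # substr() is 1-based: take the last n characters of $2
        if (m >= n && substr($2, m - n + 1) == $1) print
    }' "$@"
}

# Two lines from the question plus one made-up line ("site1xcom") that the
# regex form would also accept, since "." matches any single character.
printf '%s\n' \
    'site1.com,mail.site1.com' \
    'site2.com,testmail.com' \
    'site1.com,mail.site1xcom' | suffix_rows
```

With this input only the first line is printed. Calling suffix_rows file.csv reads from a file instead of standard input.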


Using grep; by default grep prints only the matching lines:

grep -E '^([^,]+),.*\1$' file.csv

Using sed, printing only the lines that match:

sed -nE '/^([^,]+),.*\1$/ p' file.csv
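The grep -E and sed -E commands rely on a back-reference (\1) inside an extended regular expression, which POSIX leaves undefined for EREs, although common implementations accept it. As a hedged alternative, the same pattern can be written as a basic regular expression, where back-references are standardized. The function name bre_suffix_rows is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: same "2nd field ends with 1st field" test,
# expressed as a POSIX basic regular expression (BRE).
bre_suffix_rows() {
    # \([^,]\{1,\}\) captures the first field; \1 must reappear at end of line
    grep '^\([^,]\{1,\}\),.*\1$' "$@"
}

# Three lines taken from the question's sample data.
printf '%s\n' \
    'site1.com,mail.site1.com' \
    'site2.com,testmail.com' \
    'site10.com,mx-smtp222.site10.com' | bre_suffix_rows
```

The back-reference matches the captured text literally, so the dots in the domain are not treated as wildcards here.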

Example:

% cat file.txt
site1.com,mail.site1.com
site2.com,testmail.com
site3.com,mx.site3.com
site4.com,smtp.site4.com
site5.com,foomail.com
site6.com,barmail.com
site7.com,webmail.site7.com
site8.com,01mx.site8.com
site9.com,foobarmail.com
site10.com,mx-smtp222.site10.com

% awk -F, '$2 ~ $1"$"' file.txt
site1.com,mail.site1.com
site3.com,mx.site3.com
site4.com,smtp.site4.com
site7.com,webmail.site7.com
site8.com,01mx.site8.com
site10.com,mx-smtp222.site10.com

% grep -E '^([^,]+),.*\1$' file.txt
site1.com,mail.site1.com
site3.com,mx.site3.com
site4.com,smtp.site4.com
site7.com,webmail.site7.com
site8.com,01mx.site8.com
site10.com,mx-smtp222.site10.com


% sed -nE '/^([^,]+),.*\1$/ p' file.txt 
site1.com,mail.site1.com
site3.com,mx.site3.com
site4.com,smtp.site4.com
site7.com,webmail.site7.com
site8.com,01mx.site8.com
site10.com,mx-smtp222.site10.com