Grep characters before and after a match?

144

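Using this:

grep -A1 -B1 "test_pattern" file

will produce one line before and one line after the matched pattern in the file. Is there a way to display a specified number of characters instead of lines?

The lines in my file are very long, so I am not interested in printing the entire line; I only want to see the match in context. Any suggestions on how to do this?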

bash grep

— Legend

1

Duplicate of unix.stackexchange.com/q/163726; near-duplicate of stackoverflow.com/q/2034799

— sondra.kinsey

183

3 characters before and 4 characters after

$> echo "some123_string_and_another" | grep -o -P '.{0,3}string.{0,4}'
23_string_and

— ДМИТРИЙМАЛИКОВ

5

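This is a fine answer for small amounts of data, but it starts to slow down when you match more than 100 characters. For example, in my giant XML file I wanted {1,200} before and after, and it was too slow to use.

— Benubird '13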

3

@amit_g's awk version is much faster.

— ssobczak

6

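Not available on Mac OS X, so this is not really a widely available solution. The -E version (listed below) is a better solution. What is -P? Read on... -P, --perl-regexp: Interpret PATTERN as a Perl regular expression (PCRE, see below). This is highly experimental and grep -P may warn of unimplemented features.

— Xofo '14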

2

On OS X, install it via: brew install homebrew/dupes/grep and run it as ggrep.

— kenorb '15

1

As @Benubird hinted, this becomes unusable performance-wise on large files when the match needs a moderately wide surrounding context.

— matanster

113

grep -E -o ".{0,5}test_pattern.{0,5}" test.txt

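This matches up to 5 characters before and after the pattern. The -o switch tells grep to show only the match, and -E enables extended regular expressions. Be sure to put quotes around the expression, otherwise the shell may interpret it.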
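A minimal sketch of how this behaves, assuming a made-up sample line (the input strings are illustrative, not from the question). The second command shows why the lower bound is 0: when fewer than 5 characters exist before the pattern, the match still succeeds with whatever is available.

```shell
# Hypothetical sample input: one long line with the pattern in the middle.
line='aaaaaaaaaaXXXXXtest_patternYYYYYbbbbbbbbbb'

# -o prints only the matched text; -E enables the bare {min,max} interval syntax.
printf '%s\n' "$line" | grep -E -o '.{0,5}test_pattern.{0,5}'
# prints: XXXXXtest_patternYYYYY

# Only two characters precede the pattern here; {0,5} still matches.
printf '%s\n' 'abtest_patternYYYYYYYY' | grep -E -o '.{0,5}test_pattern.{0,5}'
# prints: abtest_patternYYYYY
```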

— ekse

1

Nice answer. Interestingly, the length in {} is capped at 2^8-1, so {0,255} works but {0,256} gives grep: invalid repetition count(s)

— CodeMonkey '18

Performance seems to degrade significantly as I increase the number of matched characters (5 -> 25 -> 50). Why?

— Adam Hughes

37

You can use

awk '/test_pattern/ {
    match($0, /test_pattern/); print substr($0, RSTART - 10, RLENGTH + 20);
}' file
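If a line can contain several matches, one possible variation is to loop over match() and advance past each hit. This is only a sketch: the function name ctx_awk, the variable names, and the context width n=10 are placeholders of mine, and the start position is clamped to 1 so that matches near the beginning of a line are handled explicitly.

```shell
# Print up to n characters of context around every match on each line.
# ctx_awk is a hypothetical helper; it reads stdin or the files named as arguments.
ctx_awk() {
    awk -v n=10 '{
        pos = 1                                   # where the next search starts
        while (match(substr($0, pos), /test_pattern/)) {
            at = pos + RSTART - 1                 # match position in the full line
            start = at - n
            if (start < 1) start = 1              # clamp at the start of the line
            print substr($0, start, at - start + RLENGTH + n)
            pos = at + RLENGTH                    # continue after this match
        }
    }' "$@"
}

printf '%s\n' 'aaaaatest_patternbbbbbbbbbbbbbbbbbbbbtest_patternccc' | ctx_awk
# prints:
# aaaaatest_patternbbbbbbbbbb
# bbbbbbbbbbtest_patternccc
```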

— amit_g

2

Works well even with larger files.

— Touko

4

How would you use this to find multiple matches per line?

— koox00 '16

1

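What is the significance of the first number in the pair of curly braces? Like the 0 in grep -E -o ".{0,5}test_pattern.{0,5}" test.txt?

— Lew Rockwell

Indeed faster, but not as accurate as @ekse's answer.

— Abdullah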

24

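You mean something like this:

grep -o '.\{0,20\}test_pattern.\{0,20\}' file

?

That prints up to 20 characters on either side of test_pattern. The \{0,20\} notation is like *, but specifies zero to twenty repetitions instead of zero or more. The -o says to show only the match itself, rather than the entire line.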
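As a quick illustration of the notation, the escaped-brace form (basic regular expressions, grep's default) and the bare-brace form with -E should select the same text. The sample line below is made up for the example.

```shell
# Hypothetical sample line.
line='0123456789test_patternabcdefghij'

# Basic regular expressions: interval braces must be escaped.
printf '%s\n' "$line" | grep -o '.\{0,3\}test_pattern.\{0,3\}'
# prints: 789test_patternabc

# Extended regular expressions (-E): braces are written bare.
printf '%s\n' "$line" | grep -E -o '.{0,3}test_pattern.{0,3}'
# prints: 789test_patternabc
```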

— ruakh

This command doesn't work for me: grep: Invalid content of \{\}

— Alexander Pravdin

0

With gawk, you can use the match function:

    x="hey there how are you"
    echo "$x" |awk --re-interval '{match($0,/(.{4})how(.{4})/,a);print a[1],a[2]}'
    ere   are

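If perl is an option, there is a more flexible solution. The following prints three characters before the pattern, then the actual pattern, then 5 characters after the pattern.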

echo hey there how are you |perl -lne 'print "$1$2$3" if /(.{3})(there)(.{5})/'
ey there how

This can also be applied to words instead of just characters. The following prints one word before the actual matching string.

echo hey there how are you |perl -lne 'print $1 if /(\w+) there/'
hey

The following prints one word after the pattern:

echo hey there how are you |perl -lne 'print $2 if /(\w+) there (\w+)/'
how

The following prints one word before the pattern, then the actual word, then one word after the pattern:

echo hey there how are you |perl -lne 'print "$1$2$3" if /(\w+)( there )(\w+)/'
hey there how

— P ...

0

You can use a regexp grep to find the match, plus a second grep to highlight it

echo "some123_string_and_another" | grep -o -P '.{0,3}string.{0,4}' | grep string

23_string_and

— Andrew Zhilin

0

I'll never easily remember these cryptic command modifiers, so I took the top answer and turned it into a function in my ~/.bashrc file:


cgrep() {
    # For files whose lines are tens of thousands of characters long.
    # Use cgrep to print 30 characters before and after the search pattern.
    if [ $# -eq 2 ] ; then
        # Format was: cgrep "search string" /path/to/filename
        grep -o -P ".{0,30}$1.{0,30}" "$2"
    else
        # Format was: cat /path/to/filename | cgrep "search string"
        grep -o -P ".{0,30}$1.{0,30}"
    fi
} # cgrep()

Here is what it looks like in action:

$ ll /tmp/rick/scp.Mf7UdS/Mf7UdS.Source

-rw-r--r-- 1 rick rick 25780 Jul  3 19:05 /tmp/rick/scp.Mf7UdS/Mf7UdS.Source

$ cat /tmp/rick/scp.Mf7UdS/Mf7UdS.Source | cgrep "Link to iconic"

1:43:30.3540244000 /mnt/e/bin/Link to iconic S -rwxrwxrwx 777 rick 1000 ri

$ cgrep "Link to iconic" /tmp/rick/scp.Mf7UdS/Mf7UdS.Source

1:43:30.3540244000 /mnt/e/bin/Link to iconic S -rwxrwxrwx 777 rick 1000 ri

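The file in question is one continuous 25K line, and it is hopeless to find what you are looking for using regular grep.

Notice the two different ways you can call cgrep, which parallel how grep itself is called.

There is a "niftier" way of creating the function where "$2" is only passed when set, which would save 4 lines of code. I don't have it handy though. Something like ${parm2} $parm2. If I find it, I'll revise the function and this answer.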
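One possible way to get that shorter form is to shift the pattern off the argument list and forward whatever remains with "$@", which expands to nothing when no file was given. This is a sketch rather than a drop-in replacement: it uses -E instead of -P (my assumption, chosen because -P is not available in every grep), and it expects at least one argument.

```shell
# Shorter cgrep sketch: first argument is the pattern, the rest (if any) are files.
cgrep() {
    local pattern="$1"
    shift
    # With no file arguments "$@" expands to nothing, so grep reads stdin.
    # -E replaces the original -P here (an assumption, for portability).
    grep -E -o ".{0,30}${pattern}.{0,30}" "$@"
}

# Both call styles work:
printf '%s\n' 'some123_string_and_another' | cgrep 'string'
# prints: some123_string_and_another
```

The expansion the answer was probably reaching for is ${2:+"$2"}, which yields "$2" only when it is set and non-empty; forwarding "$@" has the side benefit of accepting more than one file.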

— WinEunuuchs2Unix