How to perform a multi-line grep


15

How would you grep for text that spans two lines?

For example:

pbsnodes is a command I use that returns the utilization of a Linux cluster:

root$ pbsnodes
node1
    state = free
    procs = 2
    bar = foobar

node2
    state = free
    procs = 4
    bar = foobar

node3
    state = busy
    procs = 8
    bar = foobar

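I want to determine the number of procs that belong to nodes in the "free" state. So far I have been able to determine the "number of procs" and the "number of nodes in the free state", but I would like to combine them into a single command that shows all free procs.

In the example above, the correct answer would be 6 (2 + 4).

What I have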

root$ NUMBEROFNODES=`pbsnodes|grep 'state = free'|wc -l`
root$ echo $NUMBEROFNODES
2

root$ NUMBEROFPROCS=`pbsnodes |grep "procs = "|awk  '{ print $3 }' | awk '{ sum+=$1 } END { print sum }'`
root$ echo $NUMBEROFPROCS
14

How can I search for every line that reads "procs = x", but only if the line above it reads "state = free"?

Answers:


12

If the data is always in that format, you could simply write:

awk -vRS= '$4 == "free" {n+=$7}; END {print n}'

(RS= means that records are paragraphs.)

Or:

awk -vRS= '/state *= *free/ && match($0, "procs *=") {
  n += substr($0,RSTART+RLENGTH)}; END {print n}'
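
To see why $4 and $7 are the right fields, it can help to print what awk sees in paragraph mode. The sketch below is only an illustration: sample_pbsnodes is a hypothetical stand-in for the real pbsnodes command that replays the sample output from the question, and show_fields is a made-up helper name.

```shell
# Hypothetical stand-in for the real pbsnodes command; it replays the
# sample output from the question.
sample_pbsnodes() {
cat <<'EOF'
node1
    state = free
    procs = 2
    bar = foobar

node2
    state = free
    procs = 4
    bar = foobar

node3
    state = busy
    procs = 8
    bar = foobar
EOF
}

# With RS set to the empty string, each blank-line-separated block is one
# record, and newlines act as field separators alongside blanks.
# Print the fields the first command relies on ($1, $4, $7) plus NF.
show_fields() {
    awk -v RS= '{ print $1 ": state=" $4 ", procs=" $7 " (NF=" NF ")" }'
}

sample_pbsnodes | show_fields
```

Each node block becomes a single record of ten whitespace-separated fields, which is why the state lands in $4 and the proc count in $7. If the real output ever gains or loses a line before procs, those positions shift, and the second (match-based) command is the safer choice.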

5
$ pbsnodes
node1
    state = free
    procs = 2
    bar = foobar

node2
    state = free
    procs = 4
    bar = foobar

node3
    state = busy
    procs = 8
    bar = foobar
$ pbsnodes | grep -A 1 free
    state = free
    procs = 2
--
    state = free
    procs = 4
$ pbsnodes | grep -A 1 free | grep procs | awk '{print $3}'
2
4
$ pbsnodes | grep -A 1 free | grep procs | awk '{print $3}' | paste -sd+ 
2+4
$ pbsnodes | grep -A 1 free | grep procs | awk '{print $3}' | paste -sd+ | bc 
6

https://zh.wikipedia.org/wiki/管道_(Unix)


4

Here is one way to do it, using pcregrep:

$ pbsnodes | pcregrep -Mo 'state = free\n\s*procs = \K\d+'
2
4

$ pbsnodes | \
    pcregrep -Mo 'state = free\n\s*procs = \K\d+' | \
    awk '{ sum+=$1 }; END { print sum }'
6

3

Your output format is well suited to Perl's paragraph mode:

pbsnodes|perl -n00le 'BEGIN{ $sum = 0 }
                 m{
                   state \s* = \s* free \s* \n 
                   \s* procs \s* = \s* ([0-9]+)
                 }x 
                    and $sum += $1;
                 END{ print $sum }'

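Note

This only works because Perl's notion of a "paragraph" is a chunk of non-blank lines separated by one or more blank lines. If there are no blank lines between the node sections, this approach will not work.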
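
If the blank lines cannot be relied on, a line-oriented approach avoids paragraph mode altogether. The following is a minimal sketch in awk rather than Perl, assuming that node names start in the first column and attribute lines are indented, as in the sample. sample_no_blanks and count_free_procs are hypothetical names used only for this illustration.

```shell
# Hypothetical input: the sample from the question with the blank
# separator lines removed.
sample_no_blanks() {
cat <<'EOF'
node1
    state = free
    procs = 2
    bar = foobar
node2
    state = free
    procs = 4
    bar = foobar
node3
    state = busy
    procs = 8
    bar = foobar
EOF
}

# Track the state of the current node line by line instead of relying
# on paragraph boundaries.
count_free_procs() {
    awk '
        /^[^ \t]/             { free = 0 }               # unindented line: a new node begins
        $1 == "state"         { free = ($3 == "free") }  # remember whether this node is free
        $1 == "procs" && free { sum += $3 }              # count procs only for free nodes
        END                   { print sum + 0 }          # + 0 prints 0 when nothing matched
    '
}

sample_no_blanks | count_free_procs
```

Because the flag is reset at every unindented line, this also works when blank lines are present, and it does not require procs to directly follow state.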



3

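If your data has a fixed length (meaning a fixed number of lines per record), sed can use the N command (several times), which appends the next line to the pattern space: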

sed -n '/^node/{N;N;N;s/\n */;/g;p;}'

This should give you output like:

node1;state = free;procs = 2;bar = foobar
node2;state = free;procs = 4;bar = foobar
node3;state = busy;procs = 8;bar = foobar

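For records of variable composition (e.g., with empty separator lines), you could use the branching commands t and b, but awk will probably get you there more comfortably.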
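
As a rough illustration of the awk route, paragraph mode can join each record onto one line no matter how many lines it has. This sketch assumes records are separated by blank lines. sample_variable is hypothetical input invented for the example, and join_records is a made-up helper name.

```shell
# Hypothetical input with records of different lengths.
sample_variable() {
cat <<'EOF'
node1
    state = free
    procs = 2
    bar = foobar

node2
    state = busy
    procs = 8
EOF
}

# In paragraph mode (RS empty) each block is one record; replace every
# newline plus the indentation that follows it with a semicolon.
join_records() {
    awk -v RS= '{ gsub(/\n[ \t]*/, ";"); print }'
}

sample_variable | join_records
```

The joined lines have the same shape as the sed output above, so any follow-up filtering on the single-line records can stay the same.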


3

GNU's implementation of grep comes with two options that also print lines before (-B) and after (-A) a match. Excerpt from the man page:

   -A NUM, --after-context=NUM
          Print NUM lines of trailing context after matching lines.  Places a line containing  a  group  separator  (--)  between  contiguous  groups  of  matches.   With  the  -o  or
          --only-matching option, this has no effect and a warning is given.

   -B NUM, --before-context=NUM
          Print  NUM  lines  of  leading  context  before  matching  lines.   Places  a  line  containing  a group separator (--) between contiguous groups of matches.  With the -o or
          --only-matching option, this has no effect and a warning is given.

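So in your case, you would grep for state = free and also print the line that follows. Combining that with the snippets from your question, you end up with something like this: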

usr@srv % pbsnodes | grep -A 1 'state = free' | grep "procs = " | awk  '{ print $3 }' | awk '{ sum+=$1 } END { print sum }'
6

And shorter:

usr@srv % pbsnodes | grep -A 1 'state = free' | awk '{ sum+=$3 } END { print sum }'
6
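
The shorter form works because awk treats the non-numeric third field of the "state = free" lines, and the missing third field of the "--" separator lines, as 0. A slightly more explicit sketch that only sums the procs lines is shown below. sample_pbsnodes is a hypothetical stand-in for the real pbsnodes command, and free_procs_grep is a made-up helper name.

```shell
# Hypothetical stand-in for the real pbsnodes command; it replays the
# sample output from the question.
sample_pbsnodes() {
cat <<'EOF'
node1
    state = free
    procs = 2
    bar = foobar

node2
    state = free
    procs = 4
    bar = foobar

node3
    state = busy
    procs = 8
    bar = foobar
EOF
}

# Keep each "state = free" line plus the line after it, then sum the
# third field of the procs lines only; "--" separators are skipped.
free_procs_grep() {
    grep -A 1 'state = free' | awk '$1 == "procs" { sum += $3 } END { print sum + 0 }'
}

sample_pbsnodes | free_procs_grep
```

The guard on $1 makes no difference for the sample data, but it keeps the total correct if some other numeric attribute ever follows the state line.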

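awk does pattern matching; you don't need grep: see Stephane's answer
jasonwryan

Well, sed does pattern matching too. You could also use perl, or php, or whatever language you prefer. But at least the title of the question asked for a multi-line grep... ;-)
binfalse, 2013

Yes: but seeing as you are using awk anyway... :)
jasonwryan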

0

...and here is a Perl solution that prints the names of the nodes in the free state:

pbsnodes | perl -lne 'if (/^\S+/) { $node = $& } elsif ( /state = free/ ) { print $node }'

0

You can use awk's getline command:

$ pbsnodes | awk 'BEGIN { freeprocs = 0 } \
                  $1=="state" && $3=="free" { getline; freeprocs+=$3 } \
                  END { print freeprocs }'

From man awk:

   getline               Set $0 from next input record; set NF, NR, FNR.

   getline <file         Set $0 from next record of file; set NF.

   getline var           Set var from next input record; set NR, FNR.

   getline var <file     Set var from next record of file.

   command | getline [var]
                         Run command piping the output either into $0 or var, as above.

   command |& getline [var]
                         Run  command  as a co-process piping the output either into $0 or var, as above.  Co-processes are a
                         gawk extension.
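
One caveat with a bare getline is that it can fail at end of input, in which case $0 keeps its old value. A slightly more defensive sketch checks the return value and confirms that the line just read really is a procs line. count_free_procs_getline is a hypothetical helper name, and the printf calls feed it made-up sample input.

```shell
# Sum procs for free nodes, reading the line after "state = free" with
# getline and checking that the read succeeded.
count_free_procs_getline() {
    awk '
        $1 == "state" && $3 == "free" {
            # getline returns 1 on success, 0 at end of input, -1 on error
            if ((getline) > 0 && $1 == "procs")
                freeprocs += $3
        }
        END { print freeprocs + 0 }
    '
}

# One free node with 2 procs and one busy node with 8 procs.
printf 'node1\n    state = free\n    procs = 2\n\nnode3\n    state = busy\n    procs = 8\n' |
    count_free_procs_getline

# Input that ends right after a state line: getline returns 0 here.
printf 'node1\n    state = free\n' | count_free_procs_getline
```

Like the original command, this still assumes that procs is the line directly after state. The line consumed by getline is not matched against the other rules again.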
Licensed under cc by-sa 3.0 with attribution required.