How to perform a multi-line grep

15

How would you grep for text that spans two lines?

For example:

pbsnodes is a command I use that reports the utilization of a Linux cluster.

root$ pbsnodes
node1
    state = free
    procs = 2
    bar = foobar

node2
    state = free
    procs = 4
    bar = foobar

node3
    state = busy
    procs = 8
    bar = foobar

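I want to determine the number of procs that belong to nodes in the "free" state. So far I have been able to determine the total number of procs and the number of nodes in the free state, but I want to combine the two into a single command that shows all free procs.

In the example above, the correct answer would be 6 (2 + 4).

What I have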

root$ NUMBEROFNODES=`pbsnodes|grep 'state = free'|wc -l`
root$ echo $NUMBEROFNODES
2

root$ NUMBEROFPROCS=`pbsnodes |grep "procs = "|awk  '{ print $3 }' | awk '{ sum+=$1 } END { print sum }'`
root$ echo $NUMBEROFPROCS
14

How can I search for every line that reads "procs = x", but only if the line above it reads "state = free"?

— ud子

12

If the data is always in that format, you could simply write:

awk -vRS= '$4 == "free" {n+=$7}; END {print n}'

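(RS= sets the record separator to the empty string, which means each blank-line-separated paragraph is one record.)

Or: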
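
To make the field numbers in the one-liner above easier to follow, here is a minimal sketch that prints what awk sees in paragraph mode. sample_pbsnodes is a hypothetical stand-in for the real pbsnodes command and simply replays the sample from the question.

```shell
# Hypothetical stand-in for pbsnodes: replays the sample from the question.
sample_pbsnodes() {
cat <<'EOF'
node1
    state = free
    procs = 2
    bar = foobar

node2
    state = free
    procs = 4
    bar = foobar

node3
    state = busy
    procs = 8
    bar = foobar
EOF
}

# With RS set to the empty string, each blank-line-separated block is one
# record, and newlines separate fields just like spaces and tabs do.
# So $1 is the node name, $4 the state value and $7 the procs value.
fields=$(sample_pbsnodes | awk -v RS= '{ print NR ": name=" $1 " state=" $4 " procs=" $7 }')
printf '%s\n' "$fields"
```

This relies on every record having the same lines in the same order; if a node gains or loses a line, $4 and $7 no longer point at the state and procs values.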

awk -vRS= '/state *= *free/ && match($0, "procs *=") {
  n += substr($0,RSTART+RLENGTH)}; END {print n}'

— Stéphane Chazelas

5

$ pbsnodes
node1
    state = free
    procs = 2
    bar = foobar

node2
    state = free
    procs = 4
    bar = foobar

node3
    state = busy
    procs = 8
    bar = foobar
$ pbsnodes | grep -A 1 free
    state = free
    procs = 2
--
    state = free
    procs = 4
$ pbsnodes | grep -A 1 free | grep procs | awk '{print $3}'
2
4
$ pbsnodes | grep -A 1 free | grep procs | awk '{print $3}' | paste -sd+ 
2+4
$ pbsnodes | grep -A 1 free | grep procs | awk '{print $3}' | paste -sd+ | bc 
6

https://zh.wikipedia.org/wiki/管道_（Unix）

— Apex Predator

4

Here is one way to do it, using pcregrep.

$ pbsnodes | pcregrep -Mo 'state = free\n\s*procs = \K\d+'
2
4

Example

$ pbsnodes | \
    pcregrep -Mo 'state = free\n\s*procs = \K\d+' | \
    awk '{ sum+=$1 }; END { print sum }'
6

— slm

3

Your output format lends itself to Perl's paragraph mode:

pbsnodes|perl -n00le 'BEGIN{ $sum = 0 }
                 m{
                   state \s* = \s* free \s* \n \s*
                   procs \s* = \s* ([0-9]+)
                 }x 
                    and $sum += $1;
                 END{ print $sum }'

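Note

This works because Perl's notion of a "paragraph" is a chunk of non-blank lines separated by one or more blank lines. It will not work if there are no blank lines between the node sections.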
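
If the blank lines cannot be relied on, a line-by-line approach avoids the dependency. The following is a minimal sketch in awk rather than Perl, and it assumes that node names start in the first column while attribute lines are indented. compact_pbsnodes is a hypothetical stand-in that prints the sample without blank lines.

```shell
# Hypothetical input: the sample records, but with no blank lines between nodes.
compact_pbsnodes() {
cat <<'EOF'
node1
    state = free
    procs = 2
    bar = foobar
node2
    state = free
    procs = 4
    bar = foobar
node3
    state = busy
    procs = 8
    bar = foobar
EOF
}

free_procs=$(compact_pbsnodes | awk '
    /^[^ \t]/             { free = 0 }               # unindented line: a new node starts
    $1 == "state"         { free = ($3 == "free") }  # remember whether this node is free
    $1 == "procs" && free { sum += $3 }              # count procs only for free nodes
    END                   { print sum + 0 }          # + 0 prints 0 when nothing matched
')
echo "$free_procs"
```

Because the flag is reset at every node name, the same command also works on input that does contain blank lines.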


— Joseph R.

3

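If you have fixed-length data (fixed length meaning the number of lines in a record), you can use sed's N command (several times), which appends the next line to the pattern space: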

sed -n '/^node/{N;N;N;s/\n */;/g;p;}'

This should give you output like:

node1;state = free;procs = 2;bar = foobar
node2;state = free;procs = 4;bar = foobar
node3;state = busy;procs = 8;bar = foobar

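For records of variable composition (e.g. separated by an empty line), you could use the branching commands t and b, but awk will probably get you there more comfortably.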
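
As a rough illustration of the awk route, the sketch below flattens records of differing length into one line each. It assumes records are separated by blank lines; varied_pbsnodes and its extra "note" attribute are hypothetical and only there to give the two records different lengths.

```shell
# Hypothetical input: two records with a different number of lines each.
varied_pbsnodes() {
cat <<'EOF'
node1
    state = free
    procs = 2

node2
    state = busy
    procs = 8
    bar = foobar
    note = extra
EOF
}

# RS= makes each blank-line-separated block one record;
# -F with a newline makes each line of the block one field.
flat=$(varied_pbsnodes | awk -v RS= -F'\n' '{
    line = $1
    for (i = 2; i <= NF; i++) {
        sub(/^[ \t]+/, "", $i)      # strip the leading indentation
        line = line ";" $i          # join the lines with semicolons
    }
    print line
}')
printf '%s\n' "$flat"
```

The result has the same shape as the sed output above, without having to know in advance how many lines a record contains.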

— Peter

3

GNU's implementation of grep comes with two options that also print the lines before (-B) and after (-A) a match. An excerpt from the man page:

   -A NUM, --after-context=NUM
          Print NUM lines of trailing context after matching lines.  Places a line containing  a  group  separator  (--)  between  contiguous  groups  of  matches.   With  the  -o  or
          --only-matching option, this has no effect and a warning is given.

   -B NUM, --before-context=NUM
          Print  NUM  lines  of  leading  context  before  matching  lines.   Places  a  line  containing  a group separator (--) between contiguous groups of matches.  With the -o or
          --only-matching option, this has no effect and a warning is given.

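So in your case you would grep for state = free and also print the following line. Combining that with the snippets from your question, you end up with something like this: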

usr@srv % pbsnodes | grep -A 1 'state = free' | grep "procs = " | awk  '{ print $3 }' | awk '{ sum+=$1 } END { print sum }'
6

And a bit shorter:

usr@srv % pbsnodes | grep -A 1 'state = free' | awk '{ sum+=$3 } END { print sum }'
6
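
The shorter command works because the third field of the "state = free" lines and of the "--" separator lines is not a number and therefore adds 0 to the sum. A slightly more defensive sketch sums only the lines whose first field is procs. sample_pbsnodes is a hypothetical stand-in for pbsnodes, and the sketch assumes a grep that supports -A.

```shell
# Hypothetical stand-in for pbsnodes: replays the sample from the question.
sample_pbsnodes() {
cat <<'EOF'
node1
    state = free
    procs = 2
    bar = foobar

node2
    state = free
    procs = 4
    bar = foobar

node3
    state = busy
    procs = 8
    bar = foobar
EOF
}

# Keep each "state = free" line plus the line after it, then add up the
# third field of the procs lines only, ignoring state lines and "--".
free_procs=$(sample_pbsnodes |
    grep -A 1 'state = free' |
    awk '$1 == "procs" { sum += $3 } END { print sum + 0 }')
echo "$free_procs"
```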

— binfalse

awk does pattern matching; you don't need grep: see Stephane's answer

— jasonwryan

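Well, sed does pattern matching too. You could also use perl, or php, or whatever language you prefer. But at least the title of the question asked for a multi-line grep ... ;-)

— binfalse, 2013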

Yes: but seeing as you are using awk anyway... :)

— jasonwryan

0

...and here is a Perl solution:

pbsnodes | perl -lne 'if (/^\S+/) { $node = $& } elsif ( /state = free/ ) { print $node }'

— Reinierpost

0

You can use awk's getline command:

$ pbsnodes | awk 'BEGIN { freeprocs = 0 } \
                  $1=="state" && $3=="free" { getline; freeprocs+=$3 } \
                  END { print freeprocs }'

From man awk:

   getline               Set $0 from next input record; set NF, NR, FNR.

   getline <file         Set $0 from next record of file; set NF.

   getline var           Set var from next input record; set NR, FNR.

   getline var <file     Set var from next record of file.

   command | getline [var]
                         Run command piping the output either into $0 or var, as above.

   command |& getline [var]
                         Run  command  as a co-process piping the output either into $0 or var, as above.  Co-processes are a
                         gawk extension.
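
A plain getline returns 1 when it reads a record, 0 at end of input and -1 on error, so a slightly more defensive variant can check that a line was actually read and that it is a procs line before adding it. This is only a sketch; sample_pbsnodes is a hypothetical stand-in for pbsnodes.

```shell
# Hypothetical stand-in for pbsnodes: replays the sample from the question.
sample_pbsnodes() {
cat <<'EOF'
node1
    state = free
    procs = 2
    bar = foobar

node2
    state = free
    procs = 4
    bar = foobar

node3
    state = busy
    procs = 8
    bar = foobar
EOF
}

free_procs=$(sample_pbsnodes | awk '
    $1 == "state" && $3 == "free" {
        # Read the next line into $0; only count it if the read
        # succeeded and the line really is a procs line.
        if ((getline) > 0 && $1 == "procs")
            freeprocs += $3
    }
    END { print freeprocs + 0 }    # + 0 prints 0 when nothing matched
')
echo "$free_procs"
```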

— Skippy le Grand Gourou