How do I extract the user agent string from a log file?


12

Currently I'm running a command like this to get the most requested content:

grep "17\/Jul\/2011" other_vhosts_access.log | awk '{print $8}' | sort | uniq -c | sort -nr

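I now want to look at the user agent strings, but the problem is that they contain several spaces. Here is a typical log line. The UA is the last part, enclosed in quotes: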

example.com:80 [ip] - - [17/Jul/2011:23:59:59 +0100] "GET [url] HTTP/1.1" 200 6449 "[referer]" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/534.30 (KHTML, like Gecko) Chrome/12.0.742.122 Safari/534.30"

Is there a better tool for this than awk?

Answers:


19

If the format is consistent and the field really is enclosed in double quotes, you can use awk or cut with " as the field delimiter:

awk -F\" '{print $6}'

Or:

cut -d\" -f 6
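As a rough sketch of how this could slot into the pipeline from the question, the snippet below filters by date, splits on double quotes, and counts distinct user agents. The sample log lines (IPs, URLs, referers) are made up for illustration and stand in for other_vhosts_access.log; the field number 6 assumes the log format shown in the question.

```shell
# Hypothetical sample data standing in for other_vhosts_access.log.
log=$(mktemp)
cat > "$log" <<'EOF'
example.com:80 192.0.2.1 - - [17/Jul/2011:23:59:58 +0100] "GET /a HTTP/1.1" 200 6449 "-" "Mozilla/5.0 (Windows NT 6.1) Chrome/12.0.742.122"
example.com:80 192.0.2.2 - - [17/Jul/2011:23:59:59 +0100] "GET /b HTTP/1.1" 200 123 "-" "Mozilla/5.0 (Windows NT 6.1) Chrome/12.0.742.122"
example.com:80 192.0.2.3 - - [17/Jul/2011:23:59:59 +0100] "GET /a HTTP/1.1" 200 6449 "-" "curl/7.21.0"
example.com:80 192.0.2.4 - - [18/Jul/2011:00:00:01 +0100] "GET /a HTTP/1.1" 200 6449 "-" "curl/7.21.0"
EOF

# Splitting on ": $2 is the request, $4 the referer, $6 the user agent.
ua_counts=$(grep '17/Jul/2011' "$log" | awk -F\" '{print $6}' | sort | uniq -c | sort -nr)
printf '%s\n' "$ua_counts"

rm -f "$log"
```

One caveat: if a URL or referer contains an escaped quote, the field numbers shift. Printing $(NF-1) instead of $6 takes the last quoted field on the line, which may be more robust when the UA is always last.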

3
perl -ne 'if (/"([^"]+)"$/) { $ua{$1}++ } END { for (keys %ua) { print "$ua{$_} $_\n" } }' \
  access_log
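The same idea, matching the last quoted field at the end of the line rather than counting fields, can also be sketched with sed. This is only an illustration under the assumption that every line ends with the closing quote of the UA; the IP, URL and referer in the sample line are invented.

```shell
# Hypothetical log line in the format from the question.
line='example.com:80 192.0.2.1 - - [17/Jul/2011:23:59:59 +0100] "GET /a HTTP/1.1" 200 6449 "http://example.org/" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/534.30 (KHTML, like Gecko) Chrome/12.0.742.122 Safari/534.30"'

# Capture everything between the final pair of double quotes;
# -n with the p flag prints only lines where the pattern matched.
ua=$(printf '%s\n' "$line" | sed -n 's/.*"\([^"]*\)"$/\1/p')
printf '%s\n' "$ua"
```

Piping the sed output through sort | uniq -c | sort -nr would give counts comparable to the Perl one-liner, sorted by frequency.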