awk in Practice: A Summary

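  1. Introduction to awk
  2. Preparing the logs
  3. In practice
    1. Basic column output
    2. Filtered output
  4. Built-in variables
    1. Custom output separators
    2. String matching
    3. Splitting files
    4. Statistics
    5. awk scripts
    6. Environment variables
  5. References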

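This topic was on my mind a lot in September 2016, when we were focused on tracking service stability. Although this post was written in August 2017, it really belongs to September 2016.

Service stability in production depends partly on the code itself, but logs can also expose many problems that were never anticipated. During a rollout it is also worth analyzing the logs to track down problems caused by low-probability events; that approach is direct and backed by evidence.

So log analysis is a routine, everyday activity. This post approaches it mainly through awk, a powerful tool that is well worth getting better at.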

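Introduction to awk

awk is a programming language whose name combines the initials of the family names of its three creators: Alfred Aho, Peter Weinberger, and Brian Kernighan. It is a fine example of shared credit; Zend is named in a similar way. There is a classic book on awk, The AWK Programming Language, which is rated 9.5 on Douban and was selling for 1505.00 yuan on Amazon.

Back in 2015 I was still arguing with Guofeng about whether each log entry needed to fit on a single line. Later log analysis showed how much it matters: one entry per line is what makes line-oriented analysis tools easy to apply.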

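Preparing the logs

Here is an example log file. Each entry contains the log level, date, source file, line number, upstream IP, time cost, error code, error message, and so on. Think of it as the dish we are about to dig into.

Excerpt from ral-worker.log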
WARNING: 11-23 17:11:30:  ral-worker * 32072 [php_ral.cpp:1022][logid=690103304 worker_id=32213 optime=1479892290.111171 msg=ral_write_log log_type=E_SUM caller=Bd_DB from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=niux_yunying method=connect prot=mysql remote_ip=%3A3306 cost=0 connect=0 read=0 write=0 trans=0 dbname=niux_yunying sql= err_no=2013 err_info=Lost+connection+to+MySQL+server+at+%27waiting+for+initial+communication+packet%27%2C+system+error%3A+95]
WARNING: 11-23 17:11:30:  ral-worker * 32072 [php_ral.cpp:1022][logid=690103304 worker_id=32213 optime=1479892290.111259 msg=ral_write_log log_type=E_WARN caller=ConnMgr from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=nuomi_yunying method=getConn prot=mysql remote_ip=%3A cost=0.22900390625 connect=0 read=0 write=0 trans=0 dbname=niux_yunying sql= err_no=10007 err_info=Connect+to+Mysql%28fengzongbao%40%3A-niux_yunying%29+failed extra=32213]
WARNING: 11-23 17:11:30:  ral-worker * 32072 [php_ral.cpp:1022][logid=690103304 worker_id=32213 optime=1479892290.111370 msg=ral_write_log log_type=E_WARN caller=ConnMgr from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=nuomi_yunying method=getConn prot=mysql remote_ip= cost=0 connect=0 read=0 write=0 trans=0 dbname=nuomi_yunying sql= err_no=10006 err_info=No+host+could+be+connected+in+the+cluster]
WARNING: 11-23 17:11:35:  ral-worker * 32072 [ext/standard/protocol/nshead.cpp:68][logid=695822546 worker_id=32216 optime=1479892295.945720 caller=RAL idc=sh uniqid=2276560565 concurrency=1 request_name= service=starmap retry=0/2 is_connect_retry=0 method=nshead conv=mcpack2 prot=nshead remote_ip=10.202.23.46:8000 interface= prot_code=-7 prot_info=NSHEAD_RET_PEARCLOSE req_len=157 res_len=0 cost=-1.000 talk=-1.000 connect=23.162 write=0.039 read=-1.000 pack=0.191 unpack=-1.000 err_no=-1 msg=NsheadProtocol+read+header+failed prot_code=-7 prot_info=NSHEAD_RET_PEARCLOSE]
WARNING: 11-23 17:11:35:  ral-worker * 32072 [rpc.cpp:295][logid=695822546 worker_id=32216 optime=1479892295.945803 log_type=E_TALK product=odp subsys=newapp module=oam user_ip= local_ip=10.99.19.36 caller=RAL idc=sh uniqid=2276560565 concurrency=1 request_name= service=starmap retry=0/2 is_connect_retry=0 method=nshead conv=mcpack2 prot=nshead remote_ip=10.202.23.46:8000 interface= prot_code=-7 prot_info=NSHEAD_RET_PEARCLOSE req_len=157 res_len=0 cost=-1.000 talk=46.565 connect=23.162 write=0.039 read=23.340 pack=0.191 unpack=-1.000 err_no=8 err_info=Talk+To+Server+Failed]
WARNING: 11-23 17:11:35:  ral-worker * 32072 [ext/standard/protocol/nshead.cpp:68][logid=695822546 worker_id=32216 optime=1479892295.992290 caller=RAL idc=sh uniqid=847896472 concurrency=1 request_name= service=starmap retry=1/2 is_connect_retry=0 method=nshead conv=mcpack2 prot=nshead remote_ip=10.202.23.46:8000 interface= prot_code=-7 prot_info=NSHEAD_RET_PEARCLOSE req_len=157 res_len=0 cost=-1.000 talk=46.565 connect=23.142 write=0.012 read=23.340 pack=0.191 unpack=-1.000 err_no=-1 msg=NsheadProtocol+read+header+failed prot_code=-7 prot_info=NSHEAD_RET_PEARCLOSE]
WARNING: 11-23 17:11:35:  ral-worker * 32072 [rpc.cpp:295][logid=695822546 worker_id=32216 optime=1479892295.992359 log_type=E_TALK product=odp subsys=newapp module=oam user_ip= local_ip=10.99.19.36 caller=RAL idc=sh uniqid=847896472 concurrency=1 request_name= service=starmap retry=1/2 is_connect_retry=0 method=nshead conv=mcpack2 prot=nshead remote_ip=10.202.23.46:8000 interface= prot_code=-7 prot_info=NSHEAD_RET_PEARCLOSE req_len=157 res_len=0 cost=-1.000 talk=46.492 connect=23.142 write=0.012 read=23.327 pack=0.191 unpack=-1.000 err_no=8 err_info=Talk+To+Server+Failed]
WARNING: 11-23 17:11:36:  ral-worker * 32072 [ext/standard/protocol/nshead.cpp:68][logid=695822546 worker_id=32216 optime=1479892296.038646 caller=RAL idc=sh uniqid=3921071287 concurrency=1 request_name= service=starmap retry=2/2 is_connect_retry=0 method=nshead conv=mcpack2 prot=nshead remote_ip=10.202.23.46:8000 interface= prot_code=-7 prot_info=NSHEAD_RET_PEARCLOSE req_len=157 res_len=0 cost=-1.000 talk=46.492 connect=23.070 write=0.016 read=23.327 pack=0.191 unpack=-1.000 err_no=-1 msg=NsheadProtocol+read+header+failed prot_code=-7 prot_info=NSHEAD_RET_PEARCLOSE]
WARNING: 11-23 17:11:36:  ral-worker * 32072 [rpc.cpp:295][logid=695822546 worker_id=32216 optime=1479892296.038734 log_type=E_TALK product=odp subsys=newapp module=oam user_ip= local_ip=10.99.19.36 caller=RAL idc=sh uniqid=3921071287 concurrency=1 request_name= service=starmap retry=2/2 is_connect_retry=0 method=nshead conv=mcpack2 prot=nshead remote_ip=10.202.23.46:8000 interface= prot_code=-7 prot_info=NSHEAD_RET_PEARCLOSE req_len=157 res_len=0 cost=-1.000 talk=46.321 connect=23.070 write=0.016 read=23.221 pack=0.191 unpack=-1.000 err_no=8 err_info=Talk+To+Server+Failed]
WARNING: 11-23 17:11:36:  ral-worker * 32072 [rpc.cpp:242][logid=695822546 worker_id=32216 optime=1479892296.038781 log_type=E_SUM product=odp subsys=newapp module=oam user_ip= local_ip=10.99.19.36 caller=RAL idc=sh uniqid=3921071287 concurrency=1 request_name= service=starmap retry=2/2 is_connect_retry=0 method=nshead conv=mcpack2 prot=nshead remote_ip=10.202.23.46:8000 interface= prot_code=-7 prot_info=NSHEAD_RET_PEARCLOSE req_len=157 res_len=0 cost=139.766 talk=46.321 connect=23.070 write=0.016 read=23.221 pack=0.191 unpack=-1.000 err_no=8 err_info=Talk+To+Server+Failed]

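In practice

Unlike most other languages, awk is data-driven: you describe the data you are looking for, and then the processing to apply once that data is found.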
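As a minimal sketch of that model (the two sample lines and the variable name out are made up for illustration): an awk rule has the form pattern { action }, and the action runs for every input line that matches the pattern.

```shell
# Two made-up log lines; only their shape matters here.
out=$(printf '%s\n' 'WARNING: db err_no=10006' 'NOTICE: http err_no=0' |
  awk '$1 == "WARNING:" { print "found:", $3 }')
echo "$out"
```

Only the first line satisfies the pattern, so only its third field is printed.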

Basic column output

Formatted output by column:

Print a specific column
$ tail ral-worker.log | awk '{print $16}'
caller=RAL
server=10.36.55.25
server=10.36.55.25
server=10.36.55.25
server=10.36.55.25
caller=RAL

Filtered output

Print only the log lines whose error code is not 0 (in these lines err_no is field 25)
$ tail ral-worker.log.wf | awk '$25 != "err_no=0"'
WARNING: 08-07 12:55:41:  ral-worker * 3572 [php_ral.cpp:1022][logid=3341719911 worker_id=3677 optime=1502081741.767087 msg=ral_write_log log_type=E_WARN caller=ConnMgr from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=oam_mis method=getConn prot=mysql remote_ip= cost=0 connect=0 read=0 write=0 trans=0 dbname=oam_mis sql= err_no=10006 err_info=No+host+could+be+connected+in+the+cluster]
WARNING: 08-07 12:55:41:  ral-worker * 3572 [php_ral.cpp:1022][logid=3341723082 worker_id=3679 optime=1502081741.771120 msg=ral_write_log log_type=E_SUM caller=Bd_DB from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=nuomi_mkt method=connect prot=mysql remote_ip=127.0.0.1%3A3306 cost=0 connect=0 read=0 write=0 trans=0 dbname=nuomi_mkt sql= err_no=1049 err_info=Unknown+database+%27nuomi_mkt%27]
WARNING: 08-07 12:55:41:  ral-worker * 3572 [php_ral.cpp:1022][logid=3341723082 worker_id=3679 optime=1502081741.771381 msg=ral_write_log log_type=E_WARN caller=ConnMgr from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=oam_mis method=getConn prot=mysql remote_ip=127.0.0.1%3A3306 cost=1.343994140625 connect=0 read=0 write=0 trans=0 dbname=nuomi_mkt sql= err_no=10007 err_info=Connect+to+Mysql%28root%40127.0.0.1%3A3306-nuomi_mkt%29+failed extra=3679]
WARNING: 08-07 12:55:41:  ral-worker * 3572 [php_ral.cpp:1022][logid=3341723082 worker_id=3679 optime=1502081741.771450 msg=ral_write_log log_type=E_WARN caller=ConnMgr from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=oam_mis method=getConn prot=mysql remote_ip= cost=0 connect=0 read=0 write=0 trans=0 dbname=oam_mis sql= err_no=10006 err_info=No+host+could+be+connected+in+the+cluster]
WARNING: 08-07 12:55:41:  ral-worker * 3572 [php_ral.cpp:1022][logid=3341718449 worker_id=3675 optime=1502081741.775986 msg=ral_write_log log_type=E_SUM caller=Bd_DB from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=nuomi_mkt method=connect prot=mysql remote_ip=127.0.0.1%3A3306 cost=0 connect=0 read=0 write=0 trans=0 dbname=nuomi_mkt sql= err_no=1049 err_info=Unknown+database+%27nuomi_mkt%27]
WARNING: 08-07 12:55:41:  ral-worker * 3572 [php_ral.cpp:1022][logid=3341718449 worker_id=3675 optime=1502081741.776250 msg=ral_write_log log_type=E_WARN caller=ConnMgr from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=oam_mis method=getConn prot=mysql remote_ip=127.0.0.1%3A3306 cost=1.2578125 connect=0 read=0 write=0 trans=0 dbname=nuomi_mkt sql= err_no=10007 err_info=Connect+to+Mysql%28root%40127.0.0.1%3A3306-nuomi_mkt%29+failed extra=3675]
WARNING: 08-07 12:55:41:  ral-worker * 3572 [php_ral.cpp:1022][logid=3341718449 worker_id=3675 optime=1502081741.776308 msg=ral_write_log log_type=E_WARN caller=ConnMgr from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=oam_mis method=getConn prot=mysql remote_ip= cost=0 connect=0 read=0 write=0 trans=0 dbname=oam_mis sql= err_no=10006 err_info=No+host+could+be+connected+in+the+cluster]
WARNING: 08-07 12:55:41:  ral-worker * 3572 [php_ral.cpp:1022][logid=3341763357 worker_id=3681 optime=1502081741.776867 msg=ral_write_log log_type=E_SUM caller=Bd_DB from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=nuomi_mkt method=connect prot=mysql remote_ip=127.0.0.1%3A3306 cost=0 connect=0 read=0 write=0 trans=0 dbname=nuomi_mkt sql= err_no=1049 err_info=Unknown+database+%27nuomi_mkt%27]
WARNING: 08-07 12:55:41:  ral-worker * 3572 [php_ral.cpp:1022][logid=3341763357 worker_id=3681 optime=1502081741.777114 msg=ral_write_log log_type=E_WARN caller=ConnMgr from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=oam_mis method=getConn prot=mysql remote_ip=127.0.0.1%3A3306 cost=1.281005859375 connect=0 read=0 write=0 trans=0 dbname=nuomi_mkt sql= err_no=10007 err_info=Connect+to+Mysql%28root%40127.0.0.1%3A3306-nuomi_mkt%29+failed extra=3681]
WARNING: 08-07 12:55:41:  ral-worker * 3572 [php_ral.cpp:1022][logid=3341763357 worker_id=3681 optime=1502081741.777185 msg=ral_write_log log_type=E_WARN caller=ConnMgr from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=oam_mis method=getConn prot=mysql remote_ip= cost=0 connect=0 read=0 write=0 trans=0 dbname=oam_mis sql= err_no=10006 err_info=No+host+could+be+connected+in+the+cluster]

Here "!=" is a comparison operator. The other comparison operators are ==, >, <, >=, and <=.
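The comparisons above are string comparisons against a whole field such as log_type=E_WARN. To compare the number inside a key=value field, the value has to be separated from the key first. The following is only a sketch under assumptions: the sample lines are invented, the cost field is assumed to sit in column 2, and kv is an arbitrary array name.

```shell
# Made-up sample lines: the second field looks like cost=<number>.
out=$(printf '%s\n' 'getConn cost=0' 'getConn cost=1.34' 'connect cost=0.2' |
  awk '{
    split($2, kv, "=")          # kv[1] is the key, kv[2] is the value
    if (kv[2] + 0 > 1)          # adding 0 forces a numeric comparison
      print $1, kv[2]
  }')
echo "$out"
```

Only the line whose cost is greater than 1 is printed.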

Print only the log lines with log_type=E_WARN
$ tail ral-worker.log.wf | awk '$11 == "log_type=E_WARN"'
WARNING: 08-07 12:55:41:  ral-worker * 3572 [php_ral.cpp:1022][logid=3341719911 worker_id=3677 optime=1502081741.767087 msg=ral_write_log log_type=E_WARN caller=ConnMgr from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=oam_mis method=getConn prot=mysql remote_ip= cost=0 connect=0 read=0 write=0 trans=0 dbname=oam_mis sql= err_no=10006 err_info=No+host+could+be+connected+in+the+cluster]
WARNING: 08-07 12:55:41:  ral-worker * 3572 [php_ral.cpp:1022][logid=3341723082 worker_id=3679 optime=1502081741.771381 msg=ral_write_log log_type=E_WARN caller=ConnMgr from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=oam_mis method=getConn prot=mysql remote_ip=127.0.0.1%3A3306 cost=1.343994140625 connect=0 read=0 write=0 trans=0 dbname=nuomi_mkt sql= err_no=10007 err_info=Connect+to+Mysql%28root%40127.0.0.1%3A3306-nuomi_mkt%29+failed extra=3679]
WARNING: 08-07 12:55:41:  ral-worker * 3572 [php_ral.cpp:1022][logid=3341723082 worker_id=3679 optime=1502081741.771450 msg=ral_write_log log_type=E_WARN caller=ConnMgr from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=oam_mis method=getConn prot=mysql remote_ip= cost=0 connect=0 read=0 write=0 trans=0 dbname=oam_mis sql= err_no=10006 err_info=No+host+could+be+connected+in+the+cluster]
WARNING: 08-07 12:55:41:  ral-worker * 3572 [php_ral.cpp:1022][logid=3341718449 worker_id=3675 optime=1502081741.776250 msg=ral_write_log log_type=E_WARN caller=ConnMgr from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=oam_mis method=getConn prot=mysql remote_ip=127.0.0.1%3A3306 cost=1.2578125 connect=0 read=0 write=0 trans=0 dbname=nuomi_mkt sql= err_no=10007 err_info=Connect+to+Mysql%28root%40127.0.0.1%3A3306-nuomi_mkt%29+failed extra=3675]
WARNING: 08-07 12:55:41:  ral-worker * 3572 [php_ral.cpp:1022][logid=3341718449 worker_id=3675 optime=1502081741.776308 msg=ral_write_log log_type=E_WARN caller=ConnMgr from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=oam_mis method=getConn prot=mysql remote_ip= cost=0 connect=0 read=0 write=0 trans=0 dbname=oam_mis sql= err_no=10006 err_info=No+host+could+be+connected+in+the+cluster]
WARNING: 08-07 12:55:41:  ral-worker * 3572 [php_ral.cpp:1022][logid=3341763357 worker_id=3681 optime=1502081741.777114 msg=ral_write_log log_type=E_WARN caller=ConnMgr from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=oam_mis method=getConn prot=mysql remote_ip=127.0.0.1%3A3306 cost=1.281005859375 connect=0 read=0 write=0 trans=0 dbname=nuomi_mkt sql= err_no=10007 err_info=Connect+to+Mysql%28root%40127.0.0.1%3A3306-nuomi_mkt%29+failed extra=3681]
WARNING: 08-07 12:55:41:  ral-worker * 3572 [php_ral.cpp:1022][logid=3341763357 worker_id=3681 optime=1502081741.777185 msg=ral_write_log log_type=E_WARN caller=ConnMgr from=/home/users/fengzongbao/odp_market/php/phplib/bd/db/RALLog.php:100 service=oam_mis method=getConn prot=mysql remote_ip= cost=0 connect=0 read=0 write=0 trans=0 dbname=oam_mis sql= err_no=10006 err_info=No+host+could+be+connected+in+the+cluster]

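Built-in variables

awk provides a number of built-in variables. Here are the most common ones:

$0 The current record (the entire line)
$1..$n The first through n-th fields of the current record, separated by FS
FS Input field separator; defaults to whitespace (spaces or tabs)
NF Number of fields in the current record, i.e. how many columns it has
NR Number of records read so far, i.e. the line number, starting from 1; with multiple input files it keeps accumulating
FNR Record number within the current file; unlike NR, it restarts for each file
RS Input record separator; defaults to a newline
OFS Output field separator; defaults to a single space
ORS Output record separator; defaults to a newline
FILENAME Name of the current input file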
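To see how NR, FNR, NF, and FILENAME differ, here is a small sketch that feeds two throwaway files to awk. The file names one.txt and two.txt and their contents are hypothetical.

```shell
# Create two small temporary files.
dir=$(mktemp -d)
printf 'a b c\nd e\n' > "$dir/one.txt"
printf 'f\n' > "$dir/two.txt"

# For every line print: file name, global line number, per-file line number, field count.
out=$(cd "$dir" && awk '{ print FILENAME, NR, FNR, NF }' one.txt two.txt)
echo "$out"
rm -rf "$dir"
```

NR keeps counting across both files, while FNR starts again at 1 when two.txt begins.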

Printing line numbers:

Print the log_type=E_WARN lines together with their line numbers
$ tail ral-worker.log.wf | awk '$11 == "log_type=E_WARN" {printf "%2d %s %s %s\n", NR, FNR, $11, $16}'
 1 1 log_type=E_WARN prot=mysql
 3 3 log_type=E_WARN prot=mysql
 4 4 log_type=E_WARN prot=mysql
 6 6 log_type=E_WARN prot=mysql
 7 7 log_type=E_WARN prot=mysql
 9 9 log_type=E_WARN prot=mysql
10 10 log_type=E_WARN prot=mysql

Custom output separators

$ awk  'BEGIN{FS=":"} {print $1,$3,$6}' /etc/passwd
root 0 /root
daemon 1 /usr/sbin
bin 2 /bin
sys 3 /dev
sync 4 /bin
games 5 /usr/games
man 6 /var/cache/man
lp 7 /var/spool/lpd

The command above is equivalent to the following (-F specifies the field separator):

$ awk  -F: '{print $1,$3,$6}' /etc/passwd

An example that uses \t as the output separator (it reads /etc/passwd, whose fields are separated by colons):

$ awk  -F: '{print $1,$3,$6}' OFS="\t" /etc/passwd
root 0 /root
daemon 1 /usr/sbin
bin 2 /bin
sys 3 /dev
sync 4 /bin
games 5 /usr/games
man 6 /var/cache/man
lp 7 /var/spool/lpd
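One detail worth a sketch: OFS is only inserted between the comma-separated arguments of print, or when awk rebuilds $0 after a field assignment. The sample line below is a typical passwd-style entry, and | is used instead of a tab only so the separator is easy to see; both are assumptions for illustration.

```shell
line='root:x:0:0:root:/root:/bin/bash'

# OFS appears between the comma-separated arguments of print.
out1=$(echo "$line" | awk -F: -v OFS='|' '{ print $1, $3, $6 }')

# Assigning to any field makes awk rebuild $0 with OFS between all fields.
out2=$(echo "$line" | awk -F: -v OFS='|' '{ $1 = $1; print }')

echo "$out1"
echo "$out2"
```

Without the $1 = $1 assignment, a plain print would output the line with its original colons.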

String matching

$ awk '$33 ~ /server/' OFS="\t" ral-worker.log
NOTICE: 08-02 16:39:05: ral-worker * 20809 [rpc.cpp:240][logid=2344882345 worker_id=20813 optime=1501663145.026209 log_type=E_SUM product=odp subsys=newapp module=engine user_ip= local_ip=10.145.81.150 caller=RAL idc=sh uniqid=233576612 concurrency=1 request_name=0 service=galaxy_goods retry=0/1 is_connect_retry=0 method=POST conv=form prot=http remote_ip=10.202.6.208:80 interface= prot_code=200 prot_info= curl_code=0 curl_errmsg= uri=nop/server/rest req_len=159 res_len=644 cost=64.226 talk=64.043 connect=26.894 write=0.000 read=63.972 pack=0.017 unpack=0.004 err_no=0]
NOTICE: 08-03 21:53:13: ral-worker * 3572 [rpc.cpp:240][logid=3193817764 worker_id=3585 optime=1501768393.920692 log_type=E_SUM product=odp subsys=newapp module=engine user_ip= local_ip=10.145.81.150 caller=RAL idc=sh uniqid=2655659683 concurrency=1 request_name=0 service=galaxy_goods retry=0/1 is_connect_retry=0 method=POST conv=form prot=http remote_ip=10.202.6.208:80 interface= prot_code=200 prot_info= curl_code=0 curl_errmsg= uri=nop/server/rest req_len=159 res_len=644 cost=67.672 talk=64.092 connect=27.456 write=0.000 read=64.015 pack=0.013 unpack=0.005 err_no=0]
NOTICE: 08-04 11:34:25: ral-worker * 3572 [rpc.cpp:240][logid=2065673224 worker_id=3590 optime=1501817665.763168 log_type=E_SUM product=odp subsys=newapp module=engine user_ip= local_ip=10.145.81.150 caller=RAL idc=sh uniqid=3920350387 concurrency=1 request_name=0 service=galaxy_goods retry=0/1 is_connect_retry=0 method=POST conv=form prot=http remote_ip=10.202.6.208:80 interface= prot_code=200 prot_info= curl_code=0 curl_errmsg= uri=nop/server/rest req_len=159 res_len=644 cost=64.657 talk=64.602 connect=26.746 write=0.000 read=64.528 pack=0.013 unpack=0.003 err_no=0]

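This example matches requests whose uri contains server. The ~ operator tests a field against a pattern, and the pattern is written between the two slashes; it is an ordinary regular-expression match.

You can also write ~ /server|galaxy/ to express an "or" relationship.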
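A short sketch of the alternation, together with the negated operator !~ that selects the lines that do not match. The three sample lines and the variable names hits and misses are made up.

```shell
# Made-up lines with a uri= field in column 2.
data='req uri=nop/server/rest
req uri=galaxy/goods
req uri=static/logo.png'

# Lines whose second field contains "server" or "galaxy".
hits=$(echo "$data" | awk '$2 ~ /server|galaxy/ { print $2 }')

# Lines whose second field contains neither.
misses=$(echo "$data" | awk '$2 !~ /server|galaxy/ { print $2 }')

echo "$hits"
echo "$misses"
```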

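Splitting files

Splitting a file with awk is simple: just use redirection.
The following example splits the input into files by column 6 (NR!=1 skips the first header line). netstat prints two header lines, so the second one ends up in a file named Foreign.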

$ netstat | head -n 10 | awk 'NR!=1{print > $6}'
$ ls -l
total 16
-rw-rw-r-- 1 fengzongbao fengzongbao 80 Aug 10 17:55 CLOSE_WAIT
-rw-rw-r-- 1 fengzongbao fengzongbao 240 Aug 10 17:55 ESTABLISHED
-rw-rw-r-- 1 fengzongbao fengzongbao 80 Aug 10 17:55 Foreign
-rw-rw-r-- 1 fengzongbao fengzongbao 320 Aug 10 17:55 TIME_WAIT
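When the split column has many distinct values, some awk implementations can run out of open files. A cautious variant, sketched here with made-up input and a hypothetical .txt suffix, builds the file name in a variable, appends with >>, and closes the file after each write.

```shell
dir=$(mktemp -d)

# Made-up connection list: address, then state.
printf '%s\n' '10.0.0.1 ESTABLISHED' '10.0.0.2 TIME_WAIT' '10.0.0.3 ESTABLISHED' |
  awk -v d="$dir" '{
    f = d "/" $2 ".txt"   # one output file per state
    print $1 >> f         # append, because the file is reopened for each line
    close(f)              # release the file handle right away
  }'

est=$(cat "$dir/ESTABLISHED.txt")
tw=$(cat "$dir/TIME_WAIT.txt")
echo "$est"
echo "$tw"
rm -rf "$dir"
```

Closing after every line is slower than keeping files open, so this trade-off only pays off when the number of output files is large.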

Statistics

The following command computes the total size of all PHP files in the current directory.

$ ls -l  *.php | awk '{sum+=$5} END {print sum}'
1320

Count connections per client:

$ netstat -ntu | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -nr
40 10.212.15.235
31 10.207.6.125
11 10.145.81.150
10 127.0.0.1
3 10.128.246.28
2 10.107.65.36
1 servers)

Count the number of requests from each client:

$ awk '{a[$1]++;} END {for (i in a) print i "," a[i]}' access_log
172.20.204.129,206
172.22.152.101,1
172.22.152.103,1533
172.20.204.37,5
172.22.152.105,281
172.22.145.28,2
172.20.204.149,247
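The same associative-array idea applies to the key=value logs shown earlier, where a key such as err_no is not always in the same column. This is only a sketch: the sample lines are invented, and the loop simply scans every field for the key. Since for (k in array) visits keys in an unspecified order, the result is piped through sort.

```shell
# Made-up lines in the key=value style of the log above.
data='x=1 err_no=10006 info=a
err_no=1049 x=2
y=3 err_no=10006'

out=$(echo "$data" | awk '
  {
    for (i = 1; i <= NF; i++)
      if (split($i, kv, "=") == 2 && kv[1] == "err_no")
        count[kv[2]]++            # one counter per error code
  }
  END { for (code in count) print code, count[code] }
' | sort)
echo "$out"
```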

How much memory each user's processes occupy:

$ ps aux | awk 'NR!=1{a[$1]+=$6;} END { for(i in a) print i ", " a[i]"KB";}'
message+, 1488KB
whoopsie, 4596KB
syslog, 6368KB
www-data, 4758136KB
99, 16288KB
mysql, 117776KB
ntp, 1956KB
postfix, 2992KB
root, 673420KB
fengzon+, 4604KB

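awk scripts

The commands above use the END keyword. END marks the rule that runs after all lines have been processed. Its counterpart is BEGIN, which marks the rule that runs before any input is read.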
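A minimal sketch of the three kinds of rules side by side, using three made-up input numbers:

```shell
out=$(printf '%s\n' 3 4 5 | awk '
  BEGIN { print "start" }                    # runs once, before any input is read
        { sum += $1 }                        # runs for every input line
  END   { print "lines:", NR, "sum:", sum }  # runs once, after the last line
')
echo "$out"
```

In the END rule NR still holds the number of the last record read, which is why it can be used as a line count.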

Suppose we have the following file (a table of student scores):

$ cat score.txt
Marry 2143 78 84 77
Jack 2321 66 78 45
Tom 2122 48 77 71
Mike 2537 87 97 95
Bob 2415 40 57 62

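Our awk script is shown below (I did not write it on the command line because that would be hard to read, and because this also demonstrates another way to run awk):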

$ cat cal.awk
#!/bin/awk -f
# Before any input is read
BEGIN {
    math = 0
    english = 0
    computer = 0

    printf "NAME NO. MATH ENGLISH COMPUTER TOTAL\n"
    printf "---------------------------------------------\n"
}
# For every input line
{
    math += $3
    english += $4
    computer += $5
    printf "%-6s %-6s %4d %8d %8d %8d\n", $1, $2, $3, $4, $5, $3+$4+$5
}
# After all input has been processed
END {
    printf "---------------------------------------------\n"
    printf " TOTAL:%10d %8d %8d \n", math, english, computer
    printf "AVERAGE:%10.2f %8.2f %8.2f\n", math/NR, english/NR, computer/NR
}

Here is the result (you can also run it as ./cal.awk score.txt):

$ awk -f cal.awk score.txt
NAME NO. MATH ENGLISH COMPUTER TOTAL
---------------------------------------------
Marry 2143 78 84 77 239
Jack 2321 66 78 45 189
Tom 2122 48 77 71 196
Mike 2537 87 97 95 279
Bob 2415 40 57 62 159
---------------------------------------------
TOTAL: 319 393 350
AVERAGE: 63.80 78.60 70.00

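Environment variables

Interacting with environment variables from awk (use the -v option or ENVIRON; a variable must be exported before ENVIRON can see it):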

$ x=5

$ y=10
$ export y

$ echo $x $y
5 10

$ awk -v val=$x '{print $1, $2, $3, $4+val, $5+ENVIRON["y"]}' OFS="\t" score.txt
Marry 2143 78 89 87
Jack 2321 66 83 55
Tom 2122 48 82 81
Mike 2537 87 102 105
Bob 2415 40 62 72
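The same two mechanisms in a smaller, self-contained sketch. The input line 1 2 is made up; x and y play the same roles as above.

```shell
x=5            # plain shell variable: awk only sees it when passed with -v
export y=10    # exported variable: awk can read it through ENVIRON

out=$(echo '1 2' | awk -v val="$x" '{ print $1 + val, $2 + ENVIRON["y"] }')
echo "$out"
```

Quoting "$x" in the -v assignment keeps the shell from splitting the value if it ever contains spaces.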

References
