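One of Logstash's functions is to convert logs, which different software produces in many different formats, into a single format that is easy to search and filter. This article walks through a minimal example.
I. Result of the conversion
Example: a rabbitmq-server log entry: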
=INFO REPORT==== 16-Jan-2017::09:27:09 ===
Mirrored queue 'heat-engine-listener.e9e416bb-6733-4981-bf00-bd64c104ccad' in vhost '/': Adding mirror on node 'rabbit@server-31': <0.2266.0>
After conversion, the event looks like this:
{
"year" => "2017",
"mounthday" => "16",
"logdata" => "Mirrored queue 'heat-engine-listener.e9e416bb-6733-4981-bf00-bd64c104ccad' in vhost '/': Adding mirror on node 'rabbit@server-31': <0.2266.0>",
"message" => "=INFO REPORT==== 16-Jan-2017::09:27:09 ===\nMirrored queue 'heat-engine-listener.e9e416bb-6733-4981-bf00-bd64c104ccad' in vhost '/': Adding mirror on node 'rabbit@server-31': <0.2266.0>",
"type" => "rabbit",
"tags" => [
[0] "multiline"
],
"path" => "/var/log/rabbitmq/rabbit@server-31.log",
"@timestamp" => 2017-01-16T01:27:09.718Z,
"loglevel" => "INFO",
"@version" => "1",
"host" => "server-31",
"time" => "09:27:09",
"mounth" => "Jan"
}
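The converted event is sent to elasticsearch, where users can filter and search the aggregated logs by time, log level, host, and so on.
II. The conversion process
The same kind of log is used as the example again: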
=INFO REPORT==== 16-Jan-2017::09:27:09 ===
Mirrored queue 'reply_963a14cce15f48e786240aad41817847' in vhost '/': Adding mirror on node 'rabbit@server-31': <0.2262.0>
=INFO REPORT==== 16-Jan-2017::09:27:09 ===
Mirrored queue 'heat-engine-listener.e9e416bb-6733-4981-bf00-bd64c104ccad' in vhost '/': Adding mirror on node 'rabbit@server-31': <0.2266.0>
=INFO REPORT==== 16-Jan-2017::09:27:09 ===
Mirrored queue 'q-agent-notifier-network-update' in vhost '/': Adding mirror on node 'rabbit@server-31': <0.2270.0>
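Each log entry spans multiple lines, with one blank line before and after it, and its header line starts with =.
1. Merging multiple lines
The first step is to merge the lines of each entry into a single event.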
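Install the multiline plugin. The codec => multiline setting used in this article belongs to the logstash-codec-multiline plugin (logstash-filter-multiline is a separate plugin that works at the filter stage). The codec is normally bundled with Logstash, so the install is only needed if it is missing:
/usr/share/logstash/bin/logstash-plugin install logstash-codec-multiline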
Configure multiline merging on the input in the configuration file:
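codec => multiline {
  # a line starting with "=" begins a new log entry
  pattern => "^="
  # negate => true: act on the lines that do NOT match the pattern ...
  negate => true
  # ... and append them to the previous line
  what => "previous"
}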
After merging, each log entry becomes a single string: =INFO REPORT==== 16-Jan-2017::09:27:09 ===\nMirrored queue 'heat-engine-listener.e9e416bb-6733-4981-bf00-bd64c104ccad' in vhost '/': Adding mirror on node 'rabbit@server-31': <0.2266.0>
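To make the negate => true / what => "previous" rule concrete, here is a minimal Python sketch of the same merging logic. It is only an illustration, not how Logstash implements the codec: the function name merge_multiline is hypothetical, and blank lines are simply dropped so that the result matches the merged string shown above (the real codec may treat them differently).

```python
import re

def merge_multiline(lines, pattern=r"^="):
    """Group lines into events: a line matching `pattern` starts a new event,
    every other line is appended to the previous one (negate + previous)."""
    start = re.compile(pattern)
    events, current = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            # Assumption for this sketch: blank separator lines are dropped.
            continue
        if start.search(line) and current:
            # A new header line closes the event collected so far.
            events.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        events.append("\n".join(current))
    return events

raw = [
    "",
    "=INFO REPORT==== 16-Jan-2017::09:27:09 ===",
    "Mirrored queue 'reply_963a14cce15f48e786240aad41817847' in vhost '/': "
    "Adding mirror on node 'rabbit@server-31': <0.2262.0>",
    "",
    "=INFO REPORT==== 16-Jan-2017::09:27:09 ===",
    "Mirrored queue 'q-agent-notifier-network-update' in vhost '/': "
    "Adding mirror on node 'rabbit@server-31': <0.2270.0>",
    "",
]

for event in merge_multiline(raw):
    print(repr(event))
```

Each printed event is one header line and its body joined by \n, which is the shape the grok expression in step 3 expects.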
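2. Analyzing the log format
Looking at all of the rabbitmq logs, every entry follows this pattern:
=<log level> REPORT==== <date>::<time> ===\n<log content>
Do not forget the spaces in between.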
3. Regular-expression matching
Logstash ships with many ready-made regular-expression patterns (grok patterns); see
https://github.com/elastic/logstash/blob/v1.4.2/patterns/grok-patterns
This article uses only the built-in patterns. The merged entry to match is:
=INFO REPORT==== 16-Jan-2017::09:27:09 ===\nMirrored queue 'heat-engine-listener.e9e416bb-6733-4981-bf00-bd64c104ccad' in vhost '/': Adding mirror on node 'rabbit@server-31': <0.2266.0>
The expression I ended up with is:
^=%{LOGLEVEL:loglevel} REPORT=+ %{MONTHDAY:mounthday}-%{MONTH:mounth}-%{YEAR:year}::%{TIME:time} ===\n%{GREEDYDATA:logdata}$
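%{LOGLEVEL:loglevel} denotes a capture: the text at that position must match Logstash's built-in LOGLEVEL pattern, and the matched text is stored under the key loglevel, producing the key-value pair "loglevel":"INFO".
The other fields work the same way.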
An online grok debugger for testing expressions is available at http://grokdebug.herokuapp.com/
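The expression can also be explored offline. The sketch below is a rough Python approximation that expands each %{PATTERN:field} into a named regex group and applies it to the merged entry. The sub-patterns in PATTERNS are simplified stand-ins written for this example, not the real grok definitions, and grok_to_regex is a hypothetical helper name.

```python
import re

# Simplified stand-ins for the built-in grok patterns (assumptions, not the
# definitions shipped with Logstash).
PATTERNS = {
    "LOGLEVEL": r"[A-Z]+",
    "MONTHDAY": r"(?:0[1-9]|[12][0-9]|3[01]|[1-9])",
    "MONTH": r"[A-Z][a-z]{2}",
    "YEAR": r"\d{4}",
    "TIME": r"\d{2}:\d{2}:\d{2}",
    "GREEDYDATA": r".*",
}

def grok_to_regex(expr):
    """Replace every %{NAME:field} with a named group (?P<field>...)."""
    def expand(match):
        name, field = match.group(1), match.group(2)
        return "(?P<%s>%s)" % (field, PATTERNS[name])
    return re.compile(re.sub(r"%\{(\w+):(\w+)\}", expand, expr))

# The expression from this article, unchanged (field names as spelled there).
GROK = (r"^=%{LOGLEVEL:loglevel} REPORT=+ "
        r"%{MONTHDAY:mounthday}-%{MONTH:mounth}-%{YEAR:year}::%{TIME:time} "
        r"===\n%{GREEDYDATA:logdata}$")

message = ("=INFO REPORT==== 16-Jan-2017::09:27:09 ===\n"
           "Mirrored queue 'heat-engine-listener.e9e416bb-6733-4981-bf00-bd64c104ccad' "
           "in vhost '/': Adding mirror on node 'rabbit@server-31': <0.2266.0>")

match = grok_to_regex(GROK).match(message)
fields = match.groupdict() if match else {}
for key, value in fields.items():
    print(key, "=>", repr(value))
```

Because GREEDYDATA is .* and does not cross line breaks, this sketch only matches entries whose content is a single line after the header, as in the example log.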