1. First, configure nginx to write its access log as JSON. The general how-to is not covered here since there are plenty of guides online; below is the log_format I use, for reference.
log_format main escape=json
    # system name (set via a set directive in each server block, see below)
    '{"system_name":"$system_name",'
    # client (visitor) information
    '"remote_addr":"$remote_addr",'
    '"remote_port":"$remote_port",'
    '"remote_user":"$remote_user",'
    # request information
    '"request":"$request",'
    '"request_body":"$request_body",'
    '"request_length":"$request_length",'
    '"request_method":"$request_method",'
    '"request_time":"$time_iso8601",'
    '"request_uri":"$uri",'
    '"request_args":"$args",'
    '"http_referer":"$http_referer",'
    '"http_cookie":"$http_cookie",'
    '"http_user_agent":"$http_user_agent",'
    '"http_x_forwarded_for":"$http_x_forwarded_for",'
    '"http_host":"$http_host",'
    '"http_status": "$status",'
    # server and upstream information
    '"server_addr":"$server_addr",'
    '"server_name":"$server_name",'
    '"server_port":"$server_port",'
    '"ups_time":"$upstream_response_time",'
    '"ups_status":"$upstream_status",'
    '"ups_server":"$upstream_http_server",'
    '"ups_addr": "$upstream_addr"}';
The system_name variable here has to be set in the nginx configuration. Since this nginx instance proxies several systems, tagging every log entry with a system_name makes it easy to tell them apart.
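A minimal sketch of setting the variable per virtual host (the host name, upstream address, and system name below are placeholders, not my real setup):

server {
    listen      80;
    server_name demo.example.com;              # hypothetical virtual host
    set $system_name "demo-system";            # tag written into the JSON access log

    location / {
        proxy_pass http://127.0.0.1:8080;      # hypothetical upstream
    }
}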
In addition, I write everything to a single access log file, rolled daily, configured like this:
if ($time_iso8601 ~ "^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})") {
    set $year    $1;
    set $month   $2;
    set $day     $3;
    set $hour    $4;
    set $minutes $5;
    set $seconds $6;
}
access_log logs/host.access-$year-$month-$day.log main;
Splitting the log by day keeps any single file from growing so large that it becomes awkward to handle or delete; old files can then be cleaned up with a scheduled job, as sketched below.
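A possible cleanup command (only a sketch; the log path and the 30-day retention are my assumptions, not part of the original setup), e.g. run once a day from cron:

# delete daily access logs older than 30 days (adjust path and retention to taste)
find /usr/local/nginx/logs -name 'host.access-*.log' -mtime +30 -delete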
2. Configure Logstash. My server runs Linux, so the commands here are Linux commands; Windows users will have to adapt them themselves.
Install Logstash itself; nothing difficult about that.
Install the jdbc output plugin and the uuid filter plugin. The uuid one is optional, as explained below.
bin/logstash-plugin install logstash-output-jdbc
bin/logstash-plugin install logstash-filter-uuid
Installing the jdbc plugin is not the whole story: the JDBC driver jar also has to be referenced in the configuration; see driver_jar_path in the config below.
Add a file named logstash.conf under the config directory with the following content; explanations are included inline as comments.
# Logstash pipeline: nginx JSON access logs -> filters -> Oracle via the jdbc output.
input {
  file {
    # Log files to scan. path is an array, so multiple entries are allowed, and the
    # asterisk (*) wildcard is supported. This picks up every file under
    # /usr/local/nginx/logs/ whose name starts with host.access- and ends with .log.
    path => ["/usr/local/nginx/logs/host.access-*.log"]
  }
}

filter {
  # nginx already writes plain JSON, but Logstash wraps each line in a "message"
  # field, which is inconvenient. Step one: parse the JSON out of message, then
  # drop message itself.
  json {
    source       => "message"
    remove_field => ["message"]
  }

  # The nginx log has no ID, and my table's ID column is NVARCHAR2, so a string ID
  # is generated here. This is what the uuid filter plugin is needed for (skip it
  # if your table generates its own keys).
  uuid {
    target    => "id"
    overwrite => true
  }

  mutate {
    convert => {
      "ups_connect_time"    => "float"
      "ups_status"          => "integer"
      "server_port"         => "integer"
      "ups_response_length" => "integer"
      "remote_port"         => "integer"
      "ups_time"            => "float"
      "http_status"         => "integer"
      "request_length"      => "integer"
      "proxy_port"          => "integer"
    }
    # Note: ups_connect_time, ups_response_length, proxy_host and proxy_port are only
    # populated if the nginx log_format also emits them ($upstream_connect_time,
    # $upstream_response_length, $proxy_host, $proxy_port); otherwise the matching
    # columns simply stay NULL.

    # nginx writes times like 2021-06-29T12:00:00+08:00; turn them into
    # 2021-06-29 12:00:00 so to_date() in the SQL statement can parse them.
    # gsub matches with regular expressions, so the + has to be escaped as \+ or it errors.
    gsub => [
      "request_time", "\+08:00", "",
      "request_time", "T", " ",
      "User_Agent", "\"", ""
    ]
  }
}

output {
  jdbc {
    # The Oracle JDBC driver jar, downloaded separately.
    driver_jar_path   => "/data/plugins/ojdbc8.jar"
    # Depending on the plugin/driver version you may also need
    # driver_class => "oracle.jdbc.OracleDriver".
    # Oracle connection string.
    connection_string => "jdbc:oracle:thin:username/password@ip:1521/sid"
    # The INSERT statement. to_date(?, 'YYYY-MM-DD HH24:MI:SS') formats the request
    # time; without it the date column rejects the value and nothing gets inserted.
    statement => [
      "INSERT INTO NGINX_LOGS(ID, SYSTEM_NAME, REMOTE_ADDR, REMOTE_PORT, REMOTE_USER, REQUEST, REQUEST_LENGTH, REQUEST_METHOD, REQUEST_TIME, REQUEST_URI, REQUEST_ARGS, HTTP_REFERER, HTTP_COOKIE, HTTP_USER_AGENT, HTTP_X_FORWARDED_FOR, HTTP_HOST, HTTP_STATUS, PROXY_HOST, PROXY_PORT, SERVER_ADDR, SERVER_NAME, SERVER_PORT, UPS_TIME, UPS_STATUS, UPS_CONNECT_TIME, UPS_RESPONSE_LENGTH, UPS_SERVER, UPS_ADDR) VALUES(?,?,?,?,?,?,?,?,to_date(?,'YYYY-MM-DD HH24:MI:SS'),?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)",
      "id","system_name","remote_addr","remote_port","remote_user","request","request_length","request_method","request_time","request_uri","request_args","http_referer","http_cookie","http_user_agent","http_x_forwarded_for","http_host","http_status","proxy_host","proxy_port","server_addr","server_name","server_port","ups_time","ups_status","ups_connect_time","ups_response_length","ups_server","ups_addr"
    ]
  }
  stdout { codec => json_lines }
}
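The INSERT assumes an NGINX_LOGS table already exists. For reference, here is a possible DDL (only a sketch: the column names come from the statement above, but the data types and lengths are my guesses, not the original schema):

CREATE TABLE NGINX_LOGS (
  ID                   NVARCHAR2(64) PRIMARY KEY, -- filled by the uuid filter
  SYSTEM_NAME          NVARCHAR2(100),
  REMOTE_ADDR          VARCHAR2(64),
  REMOTE_PORT          NUMBER(10),
  REMOTE_USER          NVARCHAR2(100),
  REQUEST              NVARCHAR2(2000),
  REQUEST_LENGTH       NUMBER(10),
  REQUEST_METHOD       VARCHAR2(16),
  REQUEST_TIME         DATE,                      -- written via to_date(?, 'YYYY-MM-DD HH24:MI:SS')
  REQUEST_URI          NVARCHAR2(2000),
  REQUEST_ARGS         NVARCHAR2(2000),
  HTTP_REFERER         NVARCHAR2(2000),
  HTTP_COOKIE          NVARCHAR2(2000),
  HTTP_USER_AGENT      NVARCHAR2(1000),
  HTTP_X_FORWARDED_FOR NVARCHAR2(200),
  HTTP_HOST            NVARCHAR2(200),
  HTTP_STATUS          NUMBER(10),
  PROXY_HOST           NVARCHAR2(200),
  PROXY_PORT           NUMBER(10),
  SERVER_ADDR          VARCHAR2(64),
  SERVER_NAME          NVARCHAR2(200),
  SERVER_PORT          NUMBER(10),
  UPS_TIME             NUMBER(12,4),
  UPS_STATUS           NUMBER(10),
  UPS_CONNECT_TIME     NUMBER(12,4),
  UPS_RESPONSE_LENGTH  NUMBER(10),
  UPS_SERVER           NVARCHAR2(200),
  UPS_ADDR             NVARCHAR2(200)
);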
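Once the table exists, start the pipeline with the standard Logstash invocation (the config path matches the file created above):

bin/logstash -f config/logstash.conf

The stdout output with codec json_lines is only there for verification: every event sent to Oracle is also printed to the console as a JSON line, which makes it easy to check that the parsing and type conversions look right.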