# Preface
In an earlier post I briefly explained how to use Prometheus together with Consul. Today I will cover blackbox_exporter.
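# I. What is blackbox_exporter?
blackbox_exporter allows black-box probing of monitoring targets over HTTP, HTTPS, DNS, TCP, and ICMP. The difference from white-box monitoring is the direction of observation: white-box monitoring looks at a system from the inside, whereas black-box probing is initiated from the outside.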
# II. Using blackbox_exporter
## 1. Download from the official site
blackbox_exporter download page
Source code: https://github.com/prometheus/blackbox_exporter
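After downloading, our startup script is:
#!/bin/bash
# Start the exporter in the background; without --config.file it reads blackbox.yml from the current directory
nohup /data/blackbox_exporter-0.19.0/blackbox_exporter &
The script to reload the configuration is:
#!/bin/bash
# Ask the running exporter to reload its configuration file
curl -X POST http://127.0.0.1:9115/-/reload
Below is the reference YAML configuration from the official repository: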
modules:
  http_2xx_example:
    prober: http
    timeout: 5s
    http:
      valid_http_versions: ["HTTP/1.1", "HTTP/2.0"]
      valid_status_codes: [] # Defaults to 2xx
      method: GET
      headers:
        Host: vhost.example.com
        Accept-Language: en-US
        Origin: example.com
      no_follow_redirects: false
      fail_if_ssl: false
      fail_if_not_ssl: false
      fail_if_body_matches_regexp:
        - "Could not connect to database"
      fail_if_body_not_matches_regexp:
        - "Download the latest version here"
      fail_if_header_matches: # Verifies that no cookies are set
        - header: Set-Cookie
          allow_missing: true
          regexp: '.*'
      fail_if_header_not_matches:
        - header: Access-Control-Allow-Origin
          regexp: '(\*|example\.com)'
      tls_config:
        insecure_skip_verify: false
      preferred_ip_protocol: "ip4" # defaults to "ip6"
      ip_protocol_fallback: false # no fallback to "ip6"
  http_post_2xx:
    prober: http
    timeout: 5s
    http:
      method: POST
      headers:
        Content-Type: application/json
      body: '{}'
  http_basic_auth_example:
    prober: http
    timeout: 5s
    http:
      method: POST
      headers:
        Host: "login.example.com"
      basic_auth:
        username: "username"
        password: "mysecret"
  http_custom_ca_example:
    prober: http
    http:
      method: GET
      tls_config:
        ca_file: "/certs/my_cert.crt"
  http_gzip:
    prober: http
    http:
      method: GET
      compression: gzip
  http_gzip_with_accept_encoding:
    prober: http
    http:
      method: GET
      compression: gzip
      headers:
        Accept-Encoding: gzip
  tls_connect:
    prober: tcp
    timeout: 5s
    tcp:
      tls: true
  tcp_connect_example:
    prober: tcp
    timeout: 5s
  imap_starttls:
    prober: tcp
    timeout: 5s
    tcp:
      query_response:
        - expect: "OK.*STARTTLS"
        - send: ". STARTTLS"
        - expect: "OK"
        - starttls: true
        - send: ". capability"
        - expect: "CAPABILITY IMAP4rev1"
  smtp_starttls:
    prober: tcp
    timeout: 5s
    tcp:
      query_response:
        - expect: "^220 ([^ ]+) ESMTP (.+)$"
        - send: "EHLO prober\r"
        - expect: "^250-STARTTLS"
        - send: "STARTTLS\r"
        - expect: "^220"
        - starttls: true
        - send: "EHLO prober\r"
        - expect: "^250-AUTH"
        - send: "QUIT\r"
  irc_banner_example:
    prober: tcp
    timeout: 5s
    tcp:
      query_response:
        - send: "NICK prober"
        - send: "USER prober prober prober :prober"
        - expect: "PING :([^ ]+)"
          send: "PONG ${1}"
        - expect: "^:[^ ]+ 001"
  icmp_example:
    prober: icmp
    timeout: 5s
    icmp:
      preferred_ip_protocol: "ip4"
      source_ip_address: "127.0.0.1"
  dns_udp_example:
    prober: dns
    timeout: 5s
    dns:
      query_name: "www.prometheus.io"
      query_type: "A"
      valid_rcodes:
        - NOERROR
      validate_answer_rrs:
        fail_if_matches_regexp:
          - ".*127.0.0.1"
        fail_if_all_match_regexp:
          - ".*127.0.0.1"
        fail_if_not_matches_regexp:
          - "www.prometheus.io.\t300\tIN\tA\t127.0.0.1"
        fail_if_none_matches_regexp:
          - "127.0.0.1"
      validate_authority_rrs:
        fail_if_matches_regexp:
          - ".*127.0.0.1"
      validate_additional_rrs:
        fail_if_matches_regexp:
          - ".*127.0.0.1"
  dns_soa:
    prober: dns
    dns:
      query_name: "prometheus.io"
      query_type: "SOA"
  dns_tcp_example:
    prober: dns
    dns:
      transport_protocol: "tcp" # defaults to "udp"
      preferred_ip_protocol: "ip4" # defaults to "ip6"
      query_name: "www.prometheus.io"
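If you have other requirements, adapt the matching example above.
## 2. Putting it to use
As the reference configuration shows, a lot can be configured.
For now we only use the parts we need. Our blackbox.yml: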
modules:
  http_2xx:
    prober: http
    timeout: 10s
    http:
      valid_http_versions: ["HTTP/1.1", "HTTP/2.0"]
      valid_status_codes: [] # Defaults to 2xx
      method: GET
      # Fail if the response body matches the regular expression
      #fail_if_body_matches_regexp:
      #  - "OK|SUCCESS"
      # Fail if the response body does not match the regular expression
      fail_if_body_not_matches_regexp:
        - "OK|SUCCESS"
  icmp: # ping-style check
    prober: icmp
    timeout: 5s
    icmp:
      preferred_ip_protocol: "ip4"
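We mainly use it for two things:
1. Ping checks
2. Checking that our own applications respond normally. Some internal projects have standardized on returning either "OK" or "SUCCESS", so this kind of check is easy to write.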
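Once blackbox_exporter is deployed, you can open http://127.0.0.1:9115 to verify that it is running.
The probe URL looks like this:
http://127.0.0.1:9115/probe?target=prometheus.io&module=http_2xx&debug=true
target: the target to probe
module: the module to use
debug: whether to include debug log output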
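To make the three parameters concrete, here is a minimal sketch of assembling the probe URL from the command line. `probe_url` is a hypothetical helper name, the default address 127.0.0.1:9115 is taken from the examples above, and the helper assumes the target contains no characters that need URL-encoding.

```shell
# Hypothetical helper: build a blackbox_exporter probe URL.
# $1 = target, $2 = module, $3 = exporter base URL (optional).
# Assumes the target needs no URL-encoding.
probe_url() {
  target="$1"
  module="$2"
  base="${3:-http://127.0.0.1:9115}"
  printf '%s/probe?target=%s&module=%s\n' "$base" "$target" "$module"
}

# Example (requires a running exporter):
#   curl -s "$(probe_url prometheus.io http_2xx)&debug=true"
```

Keeping the URL construction in one place makes it easy to switch modules (for example `icmp` instead of `http_2xx`) when checking a target by hand.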
Pay attention to the following metrics:
```
# HELP probe_icmp_duration_seconds Duration of icmp request by phase
# TYPE probe_icmp_duration_seconds gauge
# (latency and reachability; pay attention to phase="rtt")
probe_icmp_duration_seconds{phase="resolve"} 0.19693231
probe_icmp_duration_seconds{phase="rtt"} 0.159475017
probe_icmp_duration_seconds{phase="setup"} 7.5679e-05
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
# (whether the probe succeeded: 0 = failed, 1 = succeeded)
probe_success 1
```
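For a quick manual check, these values can be pulled out of the probe output with standard tools. The following is a rough sketch, not part of blackbox_exporter itself: `metric_value` and `probe_ok` are hypothetical helper names, and the matching is a plain comparison on the first field of each line.

```shell
# Hypothetical helper: read /probe output on stdin and print the value of
# the series whose name (including labels) equals $1 exactly.
metric_value() {
  METRIC="$1" awk '$1 == ENVIRON["METRIC"] { print $2 }'
}

# Hypothetical helper: succeed only if stdin contains "probe_success 1".
probe_ok() {
  [ "$(metric_value probe_success)" = "1" ]
}

# Example (requires a running exporter):
#   curl -s 'http://127.0.0.1:9115/probe?target=prometheus.io&module=icmp' \
#     | metric_value 'probe_icmp_duration_seconds{phase="rtt"}'
```

Comment lines such as `# HELP ...` are skipped automatically, because their first field is `#` rather than a series name.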
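Combine this with the Prometheus and Consul setup I described earlier, and monitored targets can be added and removed on a self-service basis, which saves a good deal of manual work.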
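As a rough sketch of what that combination might look like, the helper below prints a scrape job that could be pasted under `scrape_configs:` in prometheus.yml. Everything specific here is an assumption for illustration: the function name `print_blackbox_job`, the job name, a Consul agent at 127.0.0.1:8500, and the exporter at 127.0.0.1:9115. The relabeling follows the pattern documented in the blackbox_exporter README. In practice you would probably also restrict discovery with the `services` option so that only the intended services are probed.

```shell
# Hypothetical helper: print a Prometheus scrape job that discovers targets
# from Consul and probes each one through blackbox_exporter.
# Assumed addresses: Consul 127.0.0.1:8500, blackbox_exporter 127.0.0.1:9115.
print_blackbox_job() {
  cat <<'EOF'
  - job_name: 'blackbox_http'
    metrics_path: /probe
    params:
      module: [http_2xx]
    consul_sd_configs:
      - server: '127.0.0.1:8500'
    relabel_configs:
      # Pass the discovered address to the exporter as ?target=...
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the probed target as the instance label
      - source_labels: [__param_target]
        target_label: instance
      # Scrape blackbox_exporter itself instead of the target
      - target_label: __address__
        replacement: 127.0.0.1:9115
EOF
}

# Example: print_blackbox_job >> /tmp/blackbox_job.yml   # then review and merge by hand
```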
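# Risks
- When deploying blackbox_exporter, it is best to install node_exporter on the same host as well
- On Linux, check the ulimit settings that bound the number of connections (for example the open-file limit); they should not be too low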
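The ulimit point can be checked with a few lines of shell. This is only a sketch: `nofile_ok` is a hypothetical helper, and the threshold 65535 is an illustrative number, not a recommendation from this article. Note that `ulimit -n` reports the limit of the current shell, so run the check in the same environment that starts the exporter.

```shell
# Hypothetical helper: succeed if an open-file limit is at least the minimum.
# $1 = value as printed by `ulimit -n` (a number or "unlimited"), $2 = minimum.
nofile_ok() {
  [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}

# 65535 is an assumed, illustrative threshold.
if nofile_ok "$(ulimit -n)" 65535; then
  echo "open-file limit looks sufficient"
else
  echo "open-file limit is $(ulimit -n); consider raising it"
fi
```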
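# Summary
That is everything for this post. In later posts I will cover fping_exporter, as well as how PromQL, the Prometheus query language, is used in alerting and how the results are displayed in Grafana.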