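Reading notes
- This article assumes you have only installed Prometheus and set up a basic node_exporter.
- It suits readers who are not yet familiar with Prometheus's query syntax (PromQL) but are curious about it.
- It mainly covers some general-purpose alerting rules.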
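I. Enabling alerting in Prometheus
- Prometheus only evaluates alerting rules; it does not send notifications itself, so you need to install Alertmanager separately.
- The brief steps below are adapted from this article:
https://blog.csdn.net/aixiaoyang168/article/details/98474494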
# Replace every xxxx with your own values: sender address, username, password, and recipient address

1. cat > /root/config.yml << EOD
global:
  resolve_timeout: 5m
  smtp_from: 'xxxxxx@163.com'
  smtp_smarthost: 'smtp.163.com:465'
  smtp_auth_username: 'xxxx@163.com'
  smtp_auth_password: 'xxxxx'
  smtp_require_tls: false
  smtp_hello: '163.com'
route:
  group_by: ['alertname']
  group_wait: 5s
  group_interval: 5s
  repeat_interval: 5m
  receiver: 'email'
receivers:
- name: 'email'
  email_configs:
  - to: 'xxxxx@163.com'
    send_resolved: true
inhibit_rules:
- source_match:
    severity: 'critical'
  target_match:
    severity: 'warning'
  equal: ['alertname', 'dev', 'instance']
EOD

2. docker run -d -p '9093:9093' --name alertmanager -v "/root/config.yml:/etc/alertmanager/config.yml" bitnami/alertmanager:latest

3. vim /root/node-status.rules
groups:
- name: node-up
  rules:
  - alert: node-up
    # one alert per down target, so $labels.instance is available in the summary
    expr: up == 0
    for: 15s
    labels:
      severity: 1
      team: node
    annotations:
      summary: "{{ $labels.instance }} has been down for more than 15s!"

4. vi prometheus.yml
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - 1.1.1.1:9093
rule_files:
  - "/root/*.rules"

5. Restart Prometheus, or hot-reload it.
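Before the restart or hot reload in step 5, it can help to validate the three files first. The following is a minimal sketch, not part of the original steps. It assumes that `promtool` and `amtool` are on your PATH, that Prometheus listens on `http://localhost:9090`, and that it was started with `--web.enable-lifecycle` (the reload endpoint is disabled otherwise). The function names `run` and `check_and_reload` and the `DRY_RUN` switch are made up for illustration.

```shell
#!/bin/sh
# Sketch only: the paths and URL below are assumptions, adjust them to your setup.
PROM_URL="${PROM_URL:-http://localhost:9090}"        # assumed Prometheus address
PROM_CONFIG="${PROM_CONFIG:-prometheus.yml}"         # assumed location of prometheus.yml
RULES_FILE="${RULES_FILE:-/root/node-status.rules}"  # rules file from step 3
AM_CONFIG="${AM_CONFIG:-/root/config.yml}"           # Alertmanager config from step 1

# run: execute a command, or only print it when DRY_RUN=1 (hypothetical helper)
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# check_and_reload: validate all three files, then ask Prometheus to reload.
# It stops at the first failing check, so a broken file is never loaded.
check_and_reload() {
  run amtool check-config "$AM_CONFIG" || return 1
  run promtool check rules "$RULES_FILE" || return 1
  run promtool check config "$PROM_CONFIG" || return 1
  # The reload endpoint only responds if Prometheus runs with --web.enable-lifecycle
  run curl -fsS -X POST "$PROM_URL/-/reload"
}
```

Calling `DRY_RUN=1 check_and_reload` prints the four commands without running them; calling `check_and_reload` on its own runs them. If the lifecycle endpoint is not enabled, sending SIGHUP to the Prometheus process or simply restarting it achieves the same reload.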
I seem to drop an "e" every single time and type "promethus" instead of "prometheus". Maddening.
- Below is what a successful setup looks like. If yours does not work, read the logs and double-check the configuration.
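For the failure case, the log and configuration check can be narrowed down by testing each hop separately. The sketch below is one possible approach under stated assumptions: Alertmanager is reachable at `http://localhost:9093` (the port published in step 2), Prometheus at `http://localhost:9090`, and the container is named `alertmanager` as in step 2. The function names are hypothetical. The endpoints used are Prometheus's `/api/v1/alerts` and Alertmanager's `/api/v2/alerts`.

```shell
#!/bin/sh
# Sketch only: the addresses below are assumptions.
AM_URL="${AM_URL:-http://localhost:9093}"       # Alertmanager, port published in step 2
PROM_URL="${PROM_URL:-http://localhost:9090}"   # assumed Prometheus address

# test_alert_json NAME: print a one-element alert list in the shape
# accepted by Alertmanager's POST /api/v2/alerts.
test_alert_json() {
  printf '[{"labels":{"alertname":"%s","severity":"warning"},"annotations":{"summary":"manual test alert"}}]\n' "$1"
}

# send_test_alert NAME: push a hand-made alert straight into Alertmanager.
# This exercises routing and email delivery without waiting for a real outage.
send_test_alert() {
  test_alert_json "$1" |
    curl -fsS -X POST -H 'Content-Type: application/json' --data @- "$AM_URL/api/v2/alerts"
}

# show_state: print what each side currently knows.
show_state() {
  curl -fsS "$PROM_URL/api/v1/alerts"; echo   # alerts Prometheus sees as pending or firing
  curl -fsS "$AM_URL/api/v2/alerts"; echo     # alerts that actually reached Alertmanager
  docker logs --tail 50 alertmanager          # container name from step 2; SMTP errors show up here
}
```

A rough way to read the results: if `send_test_alert demo` produces an email, the Alertmanager and SMTP settings are fine and the problem is more likely in the rule or the `alerting` block of prometheus.yml. If an alert shows up in the Prometheus output of `show_state` but not in the Alertmanager output, the target address in prometheus.yml is the first thing to check.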