Building a Monitoring System with Prometheus, Grafana, and Consul on Docker

Prerequisites:
OS: CentOS 7.9
Docker is already installed
Intended for monitoring Linux hosts

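I. Import the Docker images (consul, grafana, prometheus, ansible)

The consul, grafana, and prometheus images were exported with docker save, so restore them with docker load:
docker load -i consul.tar
docker load -i grafana.tar
docker load -i prom.tar

The ansible image was exported with docker export, so restore it with docker import:
docker import ansible.tar ansible:v1  # ansible:v1 is the name and tag assigned to the imported image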
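
The two restore paths behave differently: docker load brings back an image with its original name, tag, and layers, while docker import builds a new image from a container filesystem and keeps no default command, which is why the ansible container is later started with an explicit command. The steps above can be sketched as a small script. This is only a sketch: the DOCKER variable and the function names are assumptions made here so the commands can be previewed, and the image list simply repeats the references used by the docker run commands in this article.

```shell
#!/usr/bin/env bash
# Sketch: restore all four images, then confirm that each one exists.
# DOCKER is a hypothetical override; set DOCKER=echo to preview the commands.
: "${DOCKER:=docker}"

import_all_images() {
  local tarball
  # Archives written by `docker save` keep their image names and tags.
  for tarball in consul.tar grafana.tar prom.tar; do
    "$DOCKER" load -i "$tarball" || return 1
  done
  # An archive written by `docker export` has no name, so one is supplied here.
  "$DOCKER" import ansible.tar ansible:v1 || return 1
}

list_expected_images() {
  # Image references used by the `docker run` commands in this article.
  printf '%s\n' consul:1.10.3 grafana/grafana:8.2.3 prom/prometheus:v2.31.0 ansible:v1
}

verify_images() {
  local image
  for image in $(list_expected_images); do
    "$DOCKER" image inspect "$image" >/dev/null || {
      echo "missing: $image" >&2
      return 1
    }
  done
}
```

Calling import_all_images and then verify_images from the directory that holds the tar files reports the first image that failed to load, if any.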

II. Start the containers
1. Optionally, create a volume for each container before starting it.
Create the volumes:

docker volume create consul
docker volume create prometheus
docker volume create ansible
docker volume create grafana
Start the containers:
docker run --name=grafana -itd -p 3000:3000 -v grafana:/usr/share/grafana grafana/grafana:8.2.3
docker run --name consul -d -v consul:/consul -p 8500:8500 consul:1.10.3
docker run --name prometheus -d -p 9090:9090 --privileged=true -v prometheus:/etc/prometheus prom/prometheus:v2.31.0
The ansible container must be started with an explicit command; top is used here only to keep the container running.
docker run -itd --name=ansible -v ansible:/etc/ansible ansible:v1 top
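
Before moving on, it can help to confirm that each service answers on its published port. The following is a minimal sketch, not part of the original setup: the function names and the host argument are assumptions, while the ports are the ones published by the commands above. It polls Grafana's /api/health, Consul's /v1/status/leader, and Prometheus's /-/ready endpoints with curl.

```shell
#!/usr/bin/env bash
# Sketch: wait until Grafana, Consul, and Prometheus answer over HTTP.

health_urls() {
  local host="$1"
  printf 'http://%s:3000/api/health\n' "$host"        # Grafana health endpoint
  printf 'http://%s:8500/v1/status/leader\n' "$host"  # Consul status endpoint
  printf 'http://%s:9090/-/ready\n' "$host"           # Prometheus readiness endpoint
}

wait_for_url() {
  local url="$1" attempts="${2:-30}" i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl fail on HTTP errors; -s hides the progress output.
    if curl -fs -o /dev/null "$url"; then
      echo "ready: $url"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "not ready: $url" >&2
  return 1
}

check_stack() {
  local url
  for url in $(health_urls "$1"); do
    wait_for_url "$url" || return 1
  done
}
```

For example, check_stack 127.0.0.1 run on the Docker host stops at the first service that does not respond within about a minute.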

2. Alternatively, start the containers without creating volumes:

docker run --name=grafana -itd -p 3000:3000 grafana/grafana:8.2.3
docker run --name consul -d -p 8500:8500 consul:1.10.3
docker run --name prometheus -d -p 9090:9090 --privileged=true prom/prometheus:v2.31.0
The ansible container must be started with an explicit command; top is used here only to keep the container running.
docker run -itd --name=ansible ansible:v1 top

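III. Deploy node-exporter in bulk
1. Once the containers are running, open a shell in the ansible container:
docker exec -it ansible /bin/sh
Change to the ansible directory:
cd /etc/ansible
Running ls shows the following files:
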
sh-4.2# ls
ansible.cfg  hosts  node-exporter.service  node-exporter.yml  node_exporter  ping.yml  roles

2. File overview:
node-exporter.service is the systemd unit file that registers node-exporter as a service managed with systemctl.

sh-4.2# cat node-exporter.service
[Unit]
Description=node-exporter
[Service]
PrivateTmp=true
Restart=always
Type=simple
ExecStart=/usr/local/bin/node_exporter
ExecStop=/bin/kill -s QUIT $MAINPID
ExecReload=/bin/kill -s HUP $MAINPID
[Install]
WantedBy=multi-user.target

node-exporter.yml is the Ansible playbook that installs node-exporter on all target hosts:

sh-4.2# cat node-exporter.yml
---
- hosts: linux
  tasks:
    - name: copy Node Exporter
      copy:
        src: /etc/ansible/node_exporter
        dest: /usr/local/bin/
        mode: 0755

    - name: copy Node Exporter.service
      copy:
        src: /etc/ansible/node-exporter.service
        dest: /etc/systemd/system/
        force: yes
        mode: 0755

    - name: Enable Node Exporter Service
      systemd:
        enabled: true
        name: node-exporter
        state: started

    - name: Open Node Exporter Port
      firewalld:
        immediate: true
        permanent: true
        port: 9100/tcp
        state: enabled
Note: the install directory in dest: /usr/local/bin/ can be changed. If you change it, update the path in ExecStart=/usr/local/bin/node_exporter in node-exporter.service to match.
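
Because the playbook's dest and the unit file's ExecStart have to stay in sync, one option is to change both in a single step. The sketch below assumes the two files look like the listings above; the function name and the example directory /opt/monitoring are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: move the node_exporter install path in both files at once.
# Arguments: new install directory, unit file, playbook.
relocate_node_exporter() {
  local install_dir="${1%/}" unit_file="$2" playbook="$3"  # ${1%/} drops a trailing slash

  # Rewrite only the ExecStart line; `|` is the sed delimiter because paths contain `/`.
  sed "s|^ExecStart=.*|ExecStart=${install_dir}/node_exporter|" "$unit_file" > "$unit_file.tmp" \
    && mv "$unit_file.tmp" "$unit_file" || return 1

  # Rewrite only the dest that pointed at the old binary directory.
  sed "s|dest: /usr/local/bin/|dest: ${install_dir}/|" "$playbook" > "$playbook.tmp" \
    && mv "$playbook.tmp" "$playbook"
}
```

Inside the ansible container this could be called as relocate_node_exporter /opt/monitoring node-exporter.service node-exporter.yml before running the playbook. The target directory must already exist on the managed hosts, since the copy task does not create it.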

node_exporter is the node-exporter executable.

hosts is the inventory file listing the hosts on which node-exporter will be installed.

sh-4.2# cat hosts
[linux]
172.16.4.102
[linux:vars]
ansible_password="P@ssw0rd"

Run the bulk installation:
From the /etc/ansible directory, run:

ansible-playbook node-exporter.yml
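
After the playbook finishes, each host should serve metrics on port 9100. A quick way to check is to read the host list back out of the inventory and request /metrics from each one. This is a rough sketch with hypothetical function names; it assumes the inventory uses the simple one-host-per-line layout shown above.

```shell
#!/usr/bin/env bash
# Sketch: confirm node-exporter answers on every host of an inventory group.

inventory_hosts() {
  # Arguments: inventory file, group name.
  # Prints the hosts listed under [group], skipping blank lines and comments.
  awk -v group="[$2]" '
    /^\[/ { in_group = ($0 == group); next }
    in_group && NF && $1 !~ /^#/ { print $1 }
  ' "$1"
}

check_exporters() {
  local host
  for host in $(inventory_hosts "$1" "$2"); do
    if curl -fs -o /dev/null --max-time 5 "http://$host:9100/metrics"; then
      echo "ok: $host"
    else
      echo "unreachable: $host"
    fi
  done
}
```

For example, check_exporters /etc/ansible/hosts linux prints one line per host in the linux group.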

Note: the Prometheus configuration file is shown below; adjust it to your needs.

/etc/prometheus $ cat prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
      
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["10.16.0.41:9090"]
  - job_name: "consul-node-exporter"
    consul_sd_configs:
      - server: 10.16.0.41:8500
    scrape_interval: 5s
    relabel_configs:
      - source_labels: ['__meta_consul_tags']
        regex:  .*node.*
        action:  keep
      - regex: __meta_consul_service_metadata_(.+)
        action: labelmap

  - job_name: "consul-docker-exporter"
    consul_sd_configs:
      - server: 10.16.0.41:8500
    scrape_interval: 5s
    relabel_configs:
      - source_labels: ['__meta_consul_tags']
        regex:  .*docker.*
        action:  keep
      - regex: __meta_consul_service_metadata_(.+)
        action: labelmap
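
The keep rules in this configuration decide which Consul services end up as scrape targets, so it is worth seeing how they match. Prometheus exposes a service's tags in __meta_consul_tags as one string, joined by the tag separator (a comma by default) and wrapped in that separator on both sides, and it anchors the relabel regex at both ends. The sketch below imitates that matching with grep. The function names are hypothetical, and grep -E only approximates the regex dialect Prometheus uses, which is close enough for patterns as simple as these.

```shell
#!/usr/bin/env bash
# Sketch: imitate how a `keep` rule on __meta_consul_tags selects services.

consul_tags_label() {
  # Join the tags with "," and wrap the result in commas, e.g. ",node-exporter,linux,".
  local joined="" tag
  for tag in "$@"; do
    joined="${joined:+$joined,}$tag"
  done
  printf ',%s,\n' "$joined"
}

keep_rule_matches() {
  # Arguments: regex, then the service's tags.
  # Returns 0 when the anchored regex matches the joined tag string.
  local regex="$1"
  shift
  consul_tags_label "$@" | grep -Eq "^(${regex})\$"
}

# Example: only services with a tag containing "node" pass the first job's rule.
for tags in "node-exporter" "consul" "docker-exporter"; do
  if keep_rule_matches '.*node.*' "$tags"; then
    echo "kept:    $tags"
  else
    echo "dropped: $tags"
  fi
done
```

In other words, a service is scraped by the consul-node-exporter job only if at least one of its tags contains node, and by the consul-docker-exporter job only if one contains docker.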

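IV. Register clients with Consul

curl -X PUT -d '{"id": "node-exporter","name": "node-exporter-172.16.4.40","address": "172.16.4.40","port": 9100,"tags": ["node-exporter"],"checks": [{"http": "http://172.16.4.40:9100/metrics","interval": "5s"}]}' http://172.16.4.40:8500/v1/agent/service/register
Note: every registration needs a unique "id", so change it for each host. 172.16.4.40:9100 is the client's IP and node-exporter port, and 172.16.4.40:8500 is the Consul address. The tag must contain "node" so that it passes the .*node.* keep rule in prometheus.yml; otherwise Prometheus drops the target.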
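
The inline JSON is easy to mistype, so one option is to build the payload in a function and keep registration and removal as separate helpers. This is a sketch under assumptions: the function names and argument order are made up for illustration, while the register and deregister paths are the Consul agent HTTP API endpoints.

```shell
#!/usr/bin/env bash
# Sketch: build the registration payload once and reuse it.

service_payload() {
  # Arguments: service id, service name, address, port, tag.
  printf '{"id": "%s", "name": "%s", "address": "%s", "port": %s, "tags": ["%s"], "checks": [{"http": "http://%s:%s/metrics", "interval": "5s"}]}\n' \
    "$1" "$2" "$3" "$4" "$5" "$3" "$4"
}

register_service() {
  # Arguments: consul host:port, then the same arguments as service_payload.
  local consul_addr="$1"
  shift
  # `--data @-` makes curl read the request body from standard input.
  service_payload "$@" | curl -fsS -X PUT --data @- "http://$consul_addr/v1/agent/service/register"
}

deregister_service() {
  # Arguments: consul host:port, service id.
  curl -fsS -X PUT "http://$1/v1/agent/service/deregister/$2"
}
```

For example, register_service 172.16.4.40:8500 node-exporter node-exporter-172.16.4.40 172.16.4.40 9100 node-exporter sends the same kind of request as the curl command above, and deregister_service 172.16.4.40:8500 node-exporter removes the service again.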

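V. Extras:
A bulk registration script, suitable for a contiguous IP range:

#!/usr/bin/env bash
# Register node-exporter on a contiguous range of hosts with Consul.
for i in $(seq 1 57)  # host numbers 1 through 57
do
  instance_id="172.16.1.$i"  # must be unique for each registered service
  service_name="172.16.1.$i"
  ip="10.14.1.$i"            # address that Prometheus will scrape
  port=9100
  # The "node-exporter" tag lets the .*node.* keep rule in prometheus.yml match.
  curl -X PUT -d '{"id": "'"$instance_id"'","name": "'"$service_name"'","address": "'"$ip"'","port": '"$port"',"tags": ["node-exporter", "'"$service_name"'"],"checks": [{"http": "http://'"$ip"':'"$port"'/metrics","interval": "5s"}]}'  http://172.16.1.1:8500/v1/agent/service/register
done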