cAdvisor已经内置在了 kubelet 组件之中,所以我们不需要单独去安装,cAdvisor的数据路径为/api/v1/nodes/<node>/proxy/metrics
1、增加job,更新prometheus配置
- job_name: ‘kubernetes-cadvisor‘
kubernetes_sd_configs:
- role: node
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_node_label_(.+)
- target_label: __address__
replacement: kubernetes.default.svc:443
- source_labels: [__meta_kubernetes_node_name]
regex: (.+)
target_label: __metrics_path__
replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
$ kubectl apply -f prometheus-cm.yaml
$ kubectl get svc -n kube-ops |grep prometheus
prometheus NodePort 10.102.197.83 <none> 9090:32619/TCP
$ curl -X POST "http://10.102.197.83:9090/-/reload" #使配置生效
监控apiserver
1、增加job,更新prometheus配置
- job_name: ‘kubernetes-apiservers‘
kubernetes_sd_configs:
- role: endpoints
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
action: keep
regex: default;kubernetes;https
$ kubectl apply -f prometheus-cm.yaml
$ kubectl get svc -n kube-ops |grep prometheus
prometheus NodePort 10.102.197.83 <none> 9090:32619/TCP
$ curl -X POST "http://10.102.197.83:9090/-/reload" #使配置生效
配置普通svc的自动发现和监控
1、增加job,更新prometheus配置
- job_name: ‘kubernetes-service-endpoints‘
kubernetes_sd_configs:
- role: endpoints
relabel_configs:
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
action: keep
regex: true
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
action: replace
target_label: __scheme__
regex: (https?)
- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
action: replace
target_label: __metrics_path__
regex: (.+)
- source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
action: replace
target_label: __address__
regex: ([^:]+)(?::\d+)?;(\d+)
replacement: $1:$2
- action: labelmap
regex: __meta_kubernetes_service_label_(.+)
- source_labels: [__meta_kubernetes_namespace]
action: replace
target_label: kubernetes_namespace
- source_labels: [__meta_kubernetes_service_name]
action: replace
target_label: kubernetes_name
要想自动发现集群中的 Service,就需要我们在 Service 的annotation区域添加:prometheus.io/scrape=true的声明
$ kubectl apply -f prometheus-cm.yaml
$ kubectl get svc -n kube-ops |grep prometheus
prometheus NodePort 10.102.197.83 <none> 9090:32619/TCP
$ curl -X POST "http://10.102.197.83:9090/-/reload" #使配置生效
2、修改redis的svc,能够动态发现并监控(在静态发现的基础上)
新增 annotations
kind: Service
apiVersion: v1
metadata:
name: redis
namespace: kube-ops
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "9121"
spec:
selector:
app: redis
ports:
- name: redis
port: 6379
targetPort: 6379
- name: prom
port: 9121
targetPort: 9121
在之前创建的 redis 这个 Service 中添加上prometheus.io/scrape=true这个 annotation
由于 redis 服务的 metrics 接口在9121这个 redis-exporter 服务上面,所以我们还需要添加一个prometheus.io/port=9121这样的annotations
$ kubectl apply -f prome-redis.yaml
3、修改trafik的svc,能够动态发现并监控(在静态发现的基础上)
apiVersion: v1
metadata:
name: traefik-ingress-service
namespace: kube-system
annotations:
prometheus.io/scrape: "true" #新增
prometheus.io/port: "8080" #新增
spec:
selector:
k8s-app: traefik-ingress-lb
ports:
- protocol: TCP
port: 80
name: web
- protocol: TCP
port: 8080
name: admin
type: NodePort
以后我们有了新的服务,如果服务本身提供了/metrics接口,我们就完全不需要用静态的方式去配置了
开启服务动态发现后,默认自动动态发现并监控的服务如下:prometheus本身,kube-dns
4、自动发现kube-state-metrics
实现在 Kubernetes 集群上 Pod、DaemonSet、Deployment、Job、CronJob 等各种资源对象的状态的监控
$ git clone https://github.com/kubernetes/kube-state-metrics.git
$ cd kube-state-metrics/kubernetes
$ kubectl apply -f .
将 kube-state-metrics 部署到 Kubernetes 上之后,就会发现 Kubernetes 集群中的 Prometheus 会在kubernetes-service-endpoints 这个 job 下自动服务发现 kube-state-metrics,并开始拉取 metrics,这是因为部署 kube-state-metrics 的 manifest 定义文件 kube-state-metrics-service.yaml 对 Service 的定义包含prometheus.io/scrape: ‘true‘这样的一个annotation