Enterprise Operations with Kubernetes in Practice: Container Resource Limits and Resource Monitoring (a beginner-friendly guide)


Kubernetes container resource limits

Memory limit example

If a container exceeds its memory limit, it is terminated. If it is restartable, the kubelet restarts it, just as with any other kind of runtime failure.

If a container exceeds its memory request, its Pod may be evicted when the node runs low on memory.

vim memory.yaml

apiVersion: v1
kind: Pod
metadata:
  name: memory-demo
spec:
  containers:
  - name: memory-demo
    image: stress
    args:
    - --vm
    - "1"
    - --vm-bytes
    - 200M
    resources:
      requests:
        memory: 50Mi
      limits:
        memory: 100Mi

Apply the manifest:

[root@server2 limit]# kubectl apply -f memory.yaml
pod/memory-demo created

The memory limit is 100Mi, but stress tries to allocate 200M, so the container is killed and the Pod's status becomes OOMKilled.
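The kill follows directly from the units: `100Mi` is mebibytes (2^20 bytes), while stress's `200M` is decimal megabytes (10^6 bytes), so the workload asks for almost twice the limit. A quick sketch of the arithmetic:

```shell
# 100Mi (mebibytes) vs. 200M (decimal megabytes)
limit_bytes=$((100 * 1024 * 1024))      # 104857600 bytes allowed by the limit
demand_bytes=$((200 * 1000 * 1000))     # 200000000 bytes requested by stress
if [ "$demand_bytes" -gt "$limit_bytes" ]; then
  echo "demand exceeds limit: container is OOMKilled"
fi
```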
CPU limit example

vim cpu.yaml

apiVersion: v1
kind: Pod
metadata:
  name: cpu-demo
spec:
  containers:
  - name: cpu-demo
    image: stress
    resources:
      limits:
        cpu: "10"
      requests:
        cpu: "5"
    args:
    - -c
    - "2"

Apply the manifest:

kubectl apply -f cpu.yaml

cpu-demo stays in the Pending state: it requests 5 CPUs, more than any node in this cluster can allocate, so the scheduler cannot place it.
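The scheduler's feasibility check reduces to comparing the Pod's CPU request against each node's allocatable CPU. A sketch, assuming 2-core lab nodes (an assumption about this environment, not stated in the output):

```shell
# Scheduling feasibility: a Pod fits only if its CPU request is within
# a node's allocatable CPU (both expressed here in millicores).
node_allocatable_m=2000   # assumed: a 2-core node
pod_request_m=5000        # requests.cpu: "5"
if [ "$pod_request_m" -gt "$node_allocatable_m" ]; then
  echo "no node fits the request: Pod stays Pending"
fi
```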

Kubernetes resource monitoring

Deploying Metrics Server

Metrics Server is the aggregator for the cluster's core monitoring data and replaces the earlier Heapster.
Container metrics come mainly from the cAdvisor service built into the kubelet; with Metrics Server deployed, users can access this monitoring data through the standard Kubernetes API.
The Metrics API only serves current measurements; it keeps no history.
The Metrics API URI is /apis/metrics.k8s.io/ and is maintained in k8s.io/metrics.
metrics-server must be deployed for this API to work; it fetches its data by calling the kubelet Summary API.
Metrics Server is not part of kube-apiserver; it is deployed independently and served alongside kube-apiserver through the Aggregator plugin mechanism.
kube-aggregator is essentially a proxy that routes each request to a concrete API backend based on its URL.
Metrics Server provides core metrics through the metrics.k8s.io API: only CPU and memory usage for Nodes and Pods.

First, download the Metrics Server manifest:
wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Edit the downloaded YAML file before applying it (for example, to point the image at a local registry):
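The edits are typically confined to the metrics-server container spec in components.yaml. A sketch of that section (the image path is a hypothetical private-registry example; the args are the standard metrics-server flags):

```yaml
# containers[0] of the metrics-server Deployment in components.yaml
containers:
- name: metrics-server
  image: reg.example.com/metrics-server:v0.5.0   # hypothetical private-registry path
  args:
  - --cert-dir=/tmp
  - --secure-port=4443
  - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS
  - --kubelet-use-node-status-port
  # --kubelet-insecure-tls would skip kubelet cert verification (not for production)
```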

[root@server2 metrics]# kubectl apply -f components.yaml 
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
[root@server2 metrics]# kubectl get pod -n kube-system  | grep metrics
metrics-server-86d6b8bbcc-lrdfh           0/1     Running   0          79s

As the output shows, metrics-server-86d6b8bbcc-lrdfh is Running but never becomes Ready.

After deployment, check the Metrics Server Pod's logs:

kubectl -n kube-system logs metrics-server-86d6b8bbcc-lrdfh

Cause: the logs show an x509 certificate error.

Metrics Server supports a --kubelet-insecure-tls flag that skips this verification, but the project explicitly states that this is not recommended for production.

Instead, we solve the problem by enabling TLS bootstrapping so that the kubelet serving certificates are properly signed.

On ALL hosts in the Kubernetes cluster:

vim /var/lib/kubelet/config.yaml
	serverTLSBootstrap: true  # append as the last line
systemctl restart kubelet
kubectl get csr  # list the certificate signing requests

All CSRs are initially Pending:

[root@server2 metrics]# kubectl get csr
NAME        AGE   SIGNERNAME                      REQUESTOR             CONDITION
csr-59zz9   3s    kubernetes.io/kubelet-serving   system:node:server3   Pending
csr-8d2rt   3s    kubernetes.io/kubelet-serving   system:node:server4   Pending
csr-chz2d   7s    kubernetes.io/kubelet-serving   system:node:server2   Pending
kubectl certificate approve  # approve the certificates
[root@server2 metrics]# kubectl certificate approve csr-59zz9 csr-8d2rt  csr-chz2d
certificatesigningrequest.certificates.k8s.io/csr-59zz9 approved
certificatesigningrequest.certificates.k8s.io/csr-8d2rt approved
certificatesigningrequest.certificates.k8s.io/csr-chz2d approved
[root@server2 metrics]# kubectl get csr
NAME        AGE    SIGNERNAME                      REQUESTOR             CONDITION
csr-59zz9   118s   kubernetes.io/kubelet-serving   system:node:server3   Approved,Issued
csr-8d2rt   118s   kubernetes.io/kubelet-serving   system:node:server4   Approved,Issued
csr-chz2d   2m2s   kubernetes.io/kubelet-serving   system:node:server2   Approved,Issued

Check the metrics-server Pod again:

kubectl get pod -n kube-system

It is now Ready:

[root@server2 metrics]# kubectl get pod -n kube-system  | grep metrics
metrics-server-86d6b8bbcc-lrdfh           1/1     Running   0          35m

Once deployed, you can query the Metrics API and the node summary:

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes/server2"
kubectl top node
[root@server2 metrics]# kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes/server2"
{"kind":"NodeMetrics","apiVersion":"metrics.k8s.io/v1beta1","metadata":{"name":"server2","creationTimestamp":"2021-08-03T16:42:07Z","labels":{"beta.kubernetes.io/arch":"amd64","beta.kubernetes.io/os":"linux","kubernetes.io/arch":"amd64","kubernetes.io/hostname":"server2","kubernetes.io/os":"linux","node-role.kubernetes.io/control-plane":"","node-role.kubernetes.io/master":"","node.kubernetes.io/exclude-from-external-load-balancers":""}},"timestamp":"2021-08-03T16:41:48Z","window":"10s","usage":{"cpu":"167373877n","memory":"1410456Ki"}}
[root@server2 metrics]# kubectl top node
W0804 00:42:13.983722   24949 top_node.go:119] Using json format to get metrics. Next release will switch to protocol-buffers, switch early by passing --use-protocol-buffers flag
NAME      CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
server2   172m         8%     1378Mi          72%       
server3   66m          6%     570Mi           64%       
server4   79m          7%     599Mi           67%   
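The raw API reports CPU in nanocores (the `n` suffix) and memory in Ki, while `kubectl top` shows millicores and Mi; the conversion is simple division, sketched here with the values from the response above:

```shell
# NodeMetrics units: CPU in nanocores ("167373877n"), memory in Ki.
cpu_nanocores=167373877
mem_ki=1410456
echo "$((cpu_nanocores / 1000000))m"   # millicores, as in kubectl top
echo "$((mem_ki / 1024))Mi"            # mebibytes; kubectl top rounds up (1378Mi)
```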

Additional notes:
Error 1: dial tcp: lookup server2 on 10.96.0.10:53: no such host

This happens when there is no internal DNS server that resolves the node names, so metrics-server cannot look them up. Edit the coredns ConfigMap and add each node's hostname to a hosts block; all Pods can then resolve the node names through CoreDNS.

kubectl edit configmap coredns -n kube-system
apiVersion: v1
data:
  Corefile: |
    ...
        ready
        hosts {
           ip nodename
           ip nodename
           ip nodename
           fallthrough
        }
        kubernetes cluster.local in-addr.arpa ip6.arpa {

Error 2: Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
If metrics-server itself starts cleanly with no errors in its logs, this is usually a network problem. Switch the metrics-server Pod to host networking:
hostNetwork: true
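In the Deployment this field belongs at the Pod template level; a sketch of where it goes (only the relevant fields are shown, and the image version is an assumption):

```yaml
# metrics-server Deployment in components.yaml (relevant fields only)
spec:
  template:
    spec:
      hostNetwork: true     # run the Pod in the node's network namespace
      containers:
      - name: metrics-server
        image: k8s.gcr.io/metrics-server/metrics-server:v0.5.0   # version assumed
```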

Deploying the Dashboard

The Dashboard gives users a web UI for viewing cluster state. With Kubernetes Dashboard, users can deploy containerized applications, monitor application status, troubleshoot failures, and manage all kinds of Kubernetes resources.

Project page: https://github.com/kubernetes/dashboard

Download the deployment manifest: https://raw.githubusercontent.com/kubernetes/dashboard/v2.3.1/aio/deploy/recommended.yaml

Upload the required images to the Harbor registry in advance, then apply the downloaded manifest to deploy the Dashboard:

[root@server2 dashboard]# kubectl apply -f recommended.yaml 
namespace/kubernetes-dashboard created
serviceaccount/kubernetes-dashboard created
service/kubernetes-dashboard created
secret/kubernetes-dashboard-certs created
secret/kubernetes-dashboard-csrf created
secret/kubernetes-dashboard-key-holder created
configmap/kubernetes-dashboard-settings created
role.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrole.rbac.authorization.k8s.io/kubernetes-dashboard created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
deployment.apps/kubernetes-dashboard created
service/dashboard-metrics-scraper created
deployment.apps/dashboard-metrics-scraper created
[root@server2 dashboard]# kubectl  -n kubernetes-dashboard get svc
NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
dashboard-metrics-scraper   ClusterIP   10.105.154.129   <none>        8000/TCP   6s
kubernetes-dashboard        ClusterIP   10.102.109.205   <none>        443/TCP    7s

The Service is created with type ClusterIP; change it to LoadBalancer so the Dashboard is reachable from outside the cluster:

[root@server2 dashboard]# kubectl -n kubernetes-dashboard  edit svc kubernetes-dashboard 
service/kubernetes-dashboard edited
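The edit amounts to changing spec.type; a sketch of the resulting Service spec (ports and selector as generated by recommended.yaml):

```yaml
# kubernetes-dashboard Service after the edit (sketch)
spec:
  type: LoadBalancer     # was: ClusterIP
  ports:
  - port: 443
    targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard
```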

An external IP, 172.25.21.11, has been assigned.
If the Service stays in the pending state, see the LoadBalancer section of https://blog.csdn.net/Puuwuuchao/article/details/119172011#t5

Logging in to the Dashboard requires authentication, so we need the token from the Dashboard's ServiceAccount secret:


kubectl -n kubernetes-dashboard get secrets
kubectl -n kubernetes-dashboard describe secrets kubernetes-dashboard-token-k27nb
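Note that `kubectl describe secret` prints the token already decoded; if you instead pull it with `-o jsonpath='{.data.token}'`, Secret data comes back base64-encoded and must be decoded yourself. A minimal sketch of the decoding step, using a dummy value in place of a real token:

```shell
# Secret .data fields are base64-encoded; decode with base64 -d.
encoded=$(printf 'dummy-dashboard-token' | base64)   # stand-in for .data.token
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"
```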

After logging in with the token, you will run into RBAC permission errors: by default the Dashboard's ServiceAccount has no rights on the cluster, so it must be granted access.

vim rbac.yaml

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
[root@server2 dashboard]# kubectl apply -f rbac.yaml 
clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard-admin created

The deployment is now complete.
