1. Concepts
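When defining a Pod, you can name a PriorityClass in spec.priorityClassName. The scheduler's PrioritySort plugin orders the scheduling queue by the priority that class defines: higher-priority Pods are scheduled first, and Pods with equal priority are scheduled in the order of their queue-entry timestamps. When no suitable node is found, the Pod stays in the Pending state and the scheduler starts a "preemption" pass for it, evicting one or more lower-priority Pods so that a node can satisfy the higher-priority Pod.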
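The queue ordering described above can be sketched in a few lines of Python. This is only an illustration of the rule (higher priority first, earlier timestamp breaks ties), not the scheduler's actual code; the QueuedPod type, the scheduling_order function, and the Pod names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueuedPod:
    name: str
    priority: int       # value taken from the Pod's PriorityClass
    enqueued_at: float  # time the Pod entered the scheduling queue


def scheduling_order(pods):
    """Order Pods by descending priority, then by ascending enqueue time."""
    return sorted(pods, key=lambda p: (-p.priority, p.enqueued_at))


queue = [
    QueuedPod("low", 0, 1.0),
    QueuedPod("high-late", 222222, 3.0),
    QueuedPod("high-early", 222222, 2.0),
]
print([p.name for p in scheduling_order(queue)])
# prints ['high-early', 'high-late', 'low']
```

Although "low" entered the queue first, both higher-priority Pods are placed ahead of it; the timestamp only decides between Pods of equal priority.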
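Pod priority is a 32-bit integer. A user-defined PriorityClass may use any value up to 1000000000 (negative values are allowed), and a larger value means a higher priority. Values above 1000000000 are reserved for system-level critical Pods so that these Pods are not preempted or evicted.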
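Two built-in classes use the reserved range: system-cluster-critical, with priority 2000000000, and system-node-critical, with priority 2000001000.
Control-plane Pods (API server, controller manager, scheduler, etcd) typically run with one of these two classes, and add-ons such as metrics-server, CoreDNS, and Dashboard may be assigned one as well; which class a component gets depends on how the cluster was deployed and on the Kubernetes version.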
If a Pod does not specify a PriorityClass and no PriorityClass in the cluster is marked as the global default, its priority is 0.
2. Defining a PriorityClass
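A PriorityClass can set globalDefault: true so that its value applies to every Pod that does not name a class. Normally only one such object is allowed in a cluster; if more than one global default exists anyway, the one with the lowest priority value takes effect.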
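The defaulting rules can be summarized with a small Python sketch. It is a simplified model under stated assumptions rather than the admission code itself: the PriorityClass dataclass and resolve_priority function are hypothetical names, and the sketch treats an unknown class name as an error, on the assumption that such a Pod would be rejected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PriorityClass:
    name: str
    value: int
    global_default: bool = False


def resolve_priority(class_name: Optional[str], classes) -> int:
    """Return the effective priority for a Pod naming `class_name` (or no class)."""
    by_name = {pc.name: pc for pc in classes}
    if class_name is not None:
        if class_name not in by_name:
            # Assumption: a Pod naming a missing class is rejected.
            raise ValueError(f"no PriorityClass named {class_name!r}")
        return by_name[class_name].value
    defaults = [pc for pc in classes if pc.global_default]
    if not defaults:
        return 0  # no class named and no global default
    # Several global defaults: the lowest value takes effect.
    return min(pc.value for pc in defaults)


classes = [
    PriorityClass("demoappv12", 222222),
    PriorityClass("default-a", 100, global_default=True),
    PriorityClass("default-b", 50, global_default=True),
]
print(resolve_priority("demoappv12", classes))  # prints 222222
print(resolve_priority(None, classes))          # prints 50
```

The names default-a and default-b are made up for the example; only demoappv12 and its value 222222 come from the manifests in this post.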
demoappv11, a Deployment that uses the default priority:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demoappv11
spec:
  replicas: 4
  selector:
    matchLabels:
      app: demoappv11
  template:
    metadata:
      labels:
        app: demoappv11
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - {key: app, operator: In, values: ["demoappv11"]}
            topologyKey: kubernetes.io/hostname
      containers:
      - name: demoappv11
        image: harbor.myland.com/baseimages/centos/centos-tsinghua:7.9.2009
        command: ["sleep","1000"]
        resources:
          requests:
            memory: 2Gi
            cpu: 150m
demoappv12, a Deployment with priority 222222:
kind: PriorityClass
apiVersion: scheduling.k8s.io/v1
metadata:
  name: demoappv12
value: 222222
description: "demoappv12"
globalDefault: false
preemptionPolicy: PreemptLowerPriority
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demoappv12
spec:
  replicas: 4
  selector:
    matchLabels:
      app: demoappv12
  template:
    metadata:
      labels:
        app: demoappv12
    spec:
      priorityClassName: demoappv12
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - {key: app, operator: In, values: ["demoappv12"]}
            topologyKey: kubernetes.io/hostname
      containers:
      - name: demoappv12
        image: ikubernetes/demoapp:v1.2
        resources:
          requests:
            memory: 2Gi
            cpu: 150m
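Because the nodes do not have enough memory for both Deployments, the scheduler starts preempting the lower-priority demoappv11 Pods to make room for demoappv12,
until the demoappv11 Pods are left in the Pending state.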
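To make the outcome above easier to follow, here is a heavily simplified Python sketch of victim selection on a single node. It looks only at memory requests and ignores everything else the real scheduler weighs (CPU, affinity rules, choosing among several candidate nodes). The Pod type, the select_victims function, the Pod names, and the 3 GiB node capacity are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    name: str
    priority: int
    memory_gi: int  # memory request in GiB


def select_victims(node_capacity_gi, running, preemptor):
    """Return the Pods to evict so `preemptor` fits, or None if preemption cannot help."""
    lower = [p for p in running if p.priority < preemptor.priority]
    keep = [p for p in running if p.priority >= preemptor.priority]
    used = sum(p.memory_gi for p in keep) + preemptor.memory_gi
    if used > node_capacity_gi:
        return None  # even evicting every lower-priority Pod is not enough
    victims = []
    # Spare lower-priority Pods, highest priority first, while they still fit.
    for p in sorted(lower, key=lambda p: p.priority, reverse=True):
        if used + p.memory_gi <= node_capacity_gi:
            used += p.memory_gi
        else:
            victims.append(p)
    return victims


running = [Pod("demoappv11-a", 0, 2)]
pending = Pod("demoappv12-a", 222222, 2)
victims = select_victims(3, running, pending)
print([p.name for p in victims])  # prints ['demoappv11-a']
```

With 3 GiB of capacity, the two 2Gi requests cannot coexist, so the priority-0 Pod is chosen as the victim. Its Deployment then creates a replacement Pod, which has nowhere to run and stays Pending, matching what is described above.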