K8S StatefulSet (Stateful Workloads): Principles and Practice

I. What Is a StatefulSet (Stateful Workload)?

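StatefulSet is mainly used to manage stateful applications. The Pods it creates have persistent identifiers: even if a Pod is scheduled onto a different node in the cluster, or is destroyed and restarted, its identifier is retained. StatefulSet also supports ordered deployment and deletion of Pod instances. It has the following characteristics: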

1. Pod consistency: the Pod name, the hostname, and the order in which Pods start and stop stay consistent throughout operation.

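2. Stable storage: through volumeClaimTemplates, a PVC is created for each Pod and bound to a PV. Deleting a Pod or scaling down does not, by default, delete the volume; when the Pod restarts or the StatefulSet is scaled up again, the previous volume is mounted automatically. This gives each Pod stable storage.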

3. Stable network: together with a headless Service, StatefulSet gives each created Pod a DNS name in the format (podname).(headless service name).(namespace).svc.cluster.local, so Pod instances can reach each other by domain name.

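4. Stable ordering: Pods are ordered. Deployment and scale-up proceed one Pod at a time in the defined order (from 0 to N-1, and all preceding Pods must be Running and Ready before the next Pod is started). Deletion and scale-down proceed from N-1 to 0.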
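The naming rules behind these characteristics can be sketched in a few lines of Python. This is only an illustration, not Kubernetes code: the function names are made up for this example, `cluster.local` is assumed as the cluster domain (the default), and the sample names are the ones used later in this article.

```python
# Illustrative sketch of the stable identities a StatefulSet gives its Pods.
# Function names are hypothetical; "cluster.local" is the default cluster domain.

def pod_names(sts_name, replicas):
    """Pods are named <statefulset-name>-<ordinal>, with ordinals 0..N-1."""
    return [f"{sts_name}-{i}" for i in range(replicas)]


def pod_dns_name(pod, service, namespace="default", cluster_domain="cluster.local"):
    """Per-Pod DNS name provided through the headless Service."""
    return f"{pod}.{service}.{namespace}.svc.{cluster_domain}"


def pvc_name(template_name, pod):
    """PVCs created from volumeClaimTemplates are named <template-name>-<pod-name>."""
    return f"{template_name}-{pod}"


for pod in pod_names("myapp-statefulset", 2):
    print(pod)
    print("  DNS:", pod_dns_name(pod, "myapp-headless-service"))
    print("  PVC:", pvc_name("myappdata-pvc", pod))
```

Because each name is derived only from the StatefulSet name, the ordinal, the Service name, and the claim template name, a recreated Pod gets the same name, DNS record, and PVC as its predecessor.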

II. Use Cases for StatefulSet

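Consider StatefulSet when an application needs the characteristics described above. In practice it is often used for distributed applications, for example multiple MySQL instances with defined relationships between them (master-slave, primary-standby), and for scenarios that require persistent data, a defined startup order, and mutual access between instances.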

III. Creating and Using a StatefulSet

The recommended order for creating a StatefulSet is: create PV -> create PVC -> create Headless Service -> create StatefulSet. Readers may wonder why PVs, PVCs, and a Headless Service are needed at all.

1. Why are PVs and PVCs needed?

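PVs and PVCs are created and mounted into the Pod's containers so that data is stored persistently. When volumeClaimTemplates is used, as in this article, the StatefulSet controller creates one PVC per Pod from the template, and each PVC binds to a matching PV. This article uses statically provisioned volumes. A more convenient approach is to use a StorageClass to provision volumes dynamically, which spares the cluster administrator from creating PVs by hand; that is covered in a separate article.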

2. Why is a Headless Service needed?

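Another article by the author, "K8S-Service: Principles and Practice", describes how a Headless Service is created and what its characteristics are. Resolving the name of a Headless Service returns the IPs of all backend Pods, and every Pod created by the StatefulSet controller is given a DNS name in the format (podname).(headless service name).(namespace).svc.cluster.local. This explains why the Headless Service has to be created first: it returns all backend Pods so that a DNS record can be provided for each one, and its name becomes part of every Pod's domain name. We can test whether a StatefulSet can be created when the Service is left out: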

[root@k8s-master zhanglei]# cat statesfulset-test.yaml 
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: myapp-statefulset
spec:
#  serviceName: myapp-headless-service
  replicas: 2
  selector:
    matchLabels:
      app: myapp-pod
  template:
    metadata:
      labels:
        app: myapp-pod
    spec:
      containers:
      - name: myapp
        image: ikubernetes/myapp:v1
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: myappdata-pvc
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: myappdata-pvc
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 0.05Gi

[root@k8s-master zhanglei]# kubectl create -f sts-testservice.yaml
error: error validating "sts-testservice.yaml": error validating data: ValidationError(StatefulSet.spec): missing required field "serviceName" in io.k8s.api.apps.v1.StatefulSetSpec; if you choose to ignore these errors, turn validation off with --validate=false

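With serviceName commented out in the YAML file, the create operation fails with the error: missing required field "serviceName". If this field is not specified, the StatefulSet is not created.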

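The author's earlier articles cover the prerequisites: "K8S-PV and PVC: Principles and Practice" describes how PVs and PVCs are created, and "K8S-Service: Principles and Practice" describes how a Headless Service is created, so those steps are not repeated here. Below are the PVs, PVCs, and Headless Service that have already been created: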

[root@k8s-master zhanglei]# kubectl get pv |grep pv-statefulset
pv-statefulset-03            107374182400m   RWO            Recycle          Bound    default/myappdata-pvc-myapp-statefulset-0                           15d
pv-statefulset-04            107374182400m   RWO            Recycle          Bound    default/myappdata-pvc-myapp-statefulset-1                           14d
[root@k8s-master zhanglei]# kubectl get pvc |grep myappdata
myappdata-pvc-myapp-statefulset-0   Bound    pv-statefulset-03            107374182400m   RWO                           14d
myappdata-pvc-myapp-statefulset-1   Bound    pv-statefulset-04            107374182400m   RWO                           14d

 

[root@k8s-master zhanglei]# cat headless-svc-stu.yaml 
apiVersion: v1
kind: Service
metadata:
  name: myapp-headless-service
  labels:
    app: statefulset
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: myapp-pod

Create the StatefulSet:

[root@k8s-master zhanglei]# cat statesfulset-test.yaml 
apiVersion: apps/v1
kind: StatefulSet                           
metadata:
  name: myapp-statefulset      
spec:
  serviceName: myapp-headless-service      # the headless Service that was created earlier
  replicas: 2                              # desired number of replicas
  selector:
    matchLabels:
      app: myapp-pod 
  template:
    metadata:
      labels:
        app: myapp-pod
    spec:
      containers:
      - name: myapp
        image: ikubernetes/myapp:v1
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: myappdata-pvc               
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:                             # persistent storage claim template
  - metadata:
      name: myappdata-pvc
    spec:
      accessModes: [ "ReadWriteOnce" ]             # requested access mode
      resources:
        requests:  
          storage: 0.05Gi                          # requested capacity
[root@k8s-master zhanglei]# kubectl get sts myapp-statefulset  -o wide
NAME                READY   AGE   CONTAINERS   IMAGES
myapp-statefulset   2/2     14d   myapp        ikubernetes/myapp:v1
[root@k8s-master zhanglei]# kubectl describe sts myapp-statefulset
Name:               myapp-statefulset
Namespace:          default
CreationTimestamp:  Sat, 23 May 2020 18:25:02 +0800
Selector:           app=myapp-pod
Labels:             <none>
Annotations:        <none>
Replicas:           2 desired | 2 total
Update Strategy:    RollingUpdate
  Partition:        0
Pods Status:        2 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  app=myapp-pod
  Containers:
   myapp:
    Image:        ikubernetes/myapp:v1
    Port:         80/TCP
    Host Port:    0/TCP
    Environment:  <none>
    Mounts:
      /usr/share/nginx/html from myappdata-pvc (rw)
  Volumes:  <none>
Volume Claims:
  Name:          myappdata-pvc
  StorageClass:  
  Labels:        <none>
  Annotations:   <none>
  Capacity:      53687091200m
  Access Modes:  [ReadWriteOnce]
Events:          <none>
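The Capacity line above shows 53687091200m although the manifest requests 0.05Gi. In Kubernetes quantity notation the suffix "m" means one thousandth, so a fractional byte count is printed in milli-bytes. The following Python sketch reproduces that arithmetic; it handles only a small subset of the quantity syntax, and the function names are made up for this illustration.

```python
from fractions import Fraction

# Binary suffixes used by Kubernetes resource quantities (subset).
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def quantity_to_bytes(quantity):
    """Convert a quantity string such as "0.05Gi" or "53687091200m" to exact bytes."""
    if quantity.endswith("m"):                      # milli-units: 1/1000 of a byte
        return Fraction(quantity[:-1]) / 1000
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return Fraction(quantity[: -len(suffix)]) * factor
    return Fraction(quantity)                       # plain number of bytes


def to_milli(quantity):
    """Express a quantity in milli-bytes, the form kubectl prints for fractional sizes."""
    return int(quantity_to_bytes(quantity) * 1000)


print(to_milli("0.05Gi"))                          # milli-bytes for the 0.05Gi request
print(float(quantity_to_bytes("107374182400m")))   # the PV capacity, in bytes
```

By the same arithmetic, the 107374182400m shown for the PVs in the earlier listing corresponds to 0.1Gi.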

Check the status of the Pods. As shown below, they are Running:

[root@k8s-master zhanglei]# kubectl get pod -o wide | grep myapp-statefulset 
myapp-statefulset-0                    1/1     Running            0          5d21h   10.122.235.239   k8s-master   <none>           <none>
myapp-statefulset-1                    1/1     Running            0          112m    10.122.235.253   k8s-master   <none>           <none>

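Verifying domain names: as mentioned earlier, the StatefulSet controller together with the headless Service gives each created Pod a DNS name. First, resolve the name of the headless Service: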

[root@k8s-master zhanglei]# dig -t A myapp-headless-service.default.svc.cluster.local. @10.10.0.10

; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.el8 <<>> -t A myapp-headless-service.default.svc.cluster.local. @10.10.0.10
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 22543
;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 8e8a5971efec82f4 (echoed)
;; QUESTION SECTION:
;myapp-headless-service.default.svc.cluster.local. IN A

;; ANSWER SECTION:                                                              # all Pods created by the StatefulSet are returned
myapp-headless-service.default.svc.cluster.local. 30 IN    A 10.122.235.253          
myapp-headless-service.default.svc.cluster.local. 30 IN    A 10.122.235.239

;; Query time: 13 msec
;; SERVER: 10.10.0.10#53(10.10.0.10)
;; WHEN: Sun Jun 07 18:05:46 CST 2020
;; MSG SIZE  rcvd: 217

Resolving the name of the headless Service returns the list of all Pods. Next, resolve the name of a single Pod:

[root@k8s-master zhanglei]# dig -t A myapp-statefulset-0.myapp-headless-service.default.svc.cluster.local. @10.10.0.10                # resolve the DNS name of Pod 0

; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.el8 <<>> -t A myapp-statefulset-0.myapp-headless-service.default.svc.cluster.local. @10.10.0.10
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46972
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: d930083e06cfaca9 (echoed)
;; QUESTION SECTION:
;myapp-statefulset-0.myapp-headless-service.default.svc.cluster.local. IN A

;; ANSWER SECTION:
myapp-statefulset-0.myapp-headless-service.default.svc.cluster.local. 30 IN A 10.122.235.239          # the Pod's IP address is returned

;; Query time: 19 msec
;; SERVER: 10.10.0.10#53(10.10.0.10)
;; WHEN: Sun Jun 07 18:09:18 CST 2020
;; MSG SIZE  rcvd: 193

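Resolving the name of the myapp-statefulset-1 Pod likewise returns that Pod's IP. The two Pod instances can reach each other by domain name, which suits scenarios such as master and slave database Pods accessing each other.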
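An application can perform the same lookups with the standard resolver instead of dig. The sketch below uses Python's socket.getaddrinfo; the helper name resolve_ipv4 is made up for this illustration, and the lookups only succeed from inside the cluster (for example from another Pod), where the cluster DNS server is used.

```python
import socket


def resolve_ipv4(hostname, getaddrinfo=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses that hostname resolves to.

    The resolver function is a parameter so the helper can be exercised
    without access to the cluster DNS.
    """
    infos = getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port).
    return sorted({info[4][0] for info in infos})


# Intended usage from a Pod in the same cluster (names taken from this article):
#   resolve_ipv4("myapp-headless-service.default.svc.cluster.local")
#       -> one address per Pod behind the headless Service
#   resolve_ipv4("myapp-statefulset-0.myapp-headless-service.default.svc.cluster.local")
#       -> the single address of Pod 0
```

Looking up the Service name is a way to discover all members, while looking up a per-Pod name targets one specific instance, such as the master in a master-slave setup.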

Verifying stability:

[root@k8s-master zhanglei]# kubectl describe pod myapp-statefulset-0 | grep ClaimName
    ClaimName:  myappdata-pvc-myapp-statefulset-0
[root@k8s-master zhanglei]# kubectl delete pod myapp-statefulset-0
pod "myapp-statefulset-0" deleted
[root@k8s-master zhanglei]# kubectl get pod
NAME                                   READY   STATUS             RESTARTS   AGE
myapp-statefulset-0                    1/1     Running            0          14s
myapp-statefulset-1                    1/1     Running            0          129m

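After the Pod is deleted, the recreated Pod has the same name as the deleted one and uses the same PVC. The Pod name stays consistent, and because the original PVC is still in use, no data is lost and persistence is achieved.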

Verifying the scale-up and scale-down order:

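There are currently 2 Pods. First scale down to 1. As shown below, the Pod that is stopped is myapp-statefulset-1, which confirms that deletion starts at ordinal N-1 and proceeds from N-1 to 0: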

[root@k8s-master zhanglei]# kubectl get sts
NAME                READY   AGE
myapp-statefulset   2/2     14d
[root@k8s-master zhanglei]# kubectl scale sts myapp-statefulset --replicas=1
statefulset.apps/myapp-statefulset scaled
[root@k8s-master zhanglei]# kubectl get pod |grep myapp
myapp-statefulset-0                    1/1     Running            0          5m43s
[root@k8s-master zhanglei]# kubectl get pvc|grep myappdata-pvc-myapp-statefulset
myappdata-pvc-myapp-statefulset-0   Bound    pv-statefulset-03            107374182400m   RWO                           15d
myappdata-pvc-myapp-statefulset-1   Bound    pv-statefulset-04            107374182400m   RWO                           15d

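Although the StatefulSet was scaled down, the PVC that was mounted by the myapp-statefulset-1 Pod was not deleted, so its historical data is retained. Next, scale up to 3 Pods: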

[root@k8s-master zhanglei]# kubectl get pod |grep myapp-statefulset
myapp-statefulset-0                    1/1     Running            0          13m
myapp-statefulset-1                    1/1     Running            0          52s
myapp-statefulset-2                    0/1     Pending            0          49s

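The Pods are created in the order 0, 1, 2. myapp-statefulset-2 is created only after myapp-statefulset-1 is Running, and at this point it is still Pending (a Pod stays Pending until it can be scheduled, for example until its PVC is bound to a PV).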
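The ordering observed in the two scaling operations above can be summarized with a small Python sketch. It only models the sequence of ordinals under the default ordered Pod management policy; it does not model waiting for each Pod to become Running and Ready, and the function name is made up for this illustration.

```python
def scale_steps(sts_name, current, desired):
    """Return the ordered (action, pod) steps for scaling from current to desired replicas."""
    if desired > current:
        # Scale up: create the missing ordinals in ascending order.
        return [("create", f"{sts_name}-{i}") for i in range(current, desired)]
    # Scale down: delete the highest ordinals first, in descending order.
    return [("delete", f"{sts_name}-{i}") for i in range(current - 1, desired - 1, -1)]


print(scale_steps("myapp-statefulset", 2, 1))   # the scale-down performed above
print(scale_steps("myapp-statefulset", 1, 3))   # the scale-up performed above
```

In the real controller each step is taken only after the previous one has completed, which is why myapp-statefulset-2 appears a few seconds after myapp-statefulset-1 in the listing.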

Verifying volume sharing:

[root@k8s-master zhanglei]# kubectl describe pv pv-statefulset-testservice
Name:            pv-statefulset-testservice
Labels:          release=stable
Annotations:     pv.kubernetes.io/bound-by-controller: yes
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    
Status:          Bound
Claim:           default/myappdata-pvc-myapp-statefulset-2
Reclaim Policy:  Recycle
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        107374182400m
Node Affinity:   <none>
Message:         
Source:
    Type:          HostPath (bare host directory volume)
    Path:          /data/pod/volume7                          # directory on the host
    HostPathType:  
Events:
  Type    Reason          Age   From                         Message
  ----    ------          ----  ----                         -------
  Normal  RecyclerPod     11m   persistentvolume-controller  Recycler pod: Successfully assigned default/recycler-for-pv-statefulset-testservice to k8s-master
  Normal  RecyclerPod     11m   persistentvolume-controller  Recycler pod: Pulling image "busybox:1.27"
  Normal  RecyclerPod     11m   persistentvolume-controller  Recycler pod: Successfully pulled image "busybox:1.27"
  Normal  RecyclerPod     11m   persistentvolume-controller  Recycler pod: Created container pv-recycler
  Normal  RecyclerPod     11m   persistentvolume-controller  Recycler pod: Started container pv-recycler
  Normal  VolumeRecycled  11m   persistentvolume-controller  Volume recycled

Log in to the container, go to the shared directory /usr/share/nginx/html (it is listed under Mounts in the describe Pod output), and create a file named sts.txt:

[root@k8s-master zhanglei]# kubectl exec -it myapp-statefulset-2 -- sh
/ # ls
bin    dev    etc    home   lib    media  mnt    proc   root   run    sbin   srv    sys    tmp    usr    var
/ # cd /usr/share/nginx/html
/usr/share/nginx/html # touch sts.txt

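Back on the host, check whether the file has appeared under /data/pod/volume7. It has, so the verification is complete. Likewise, anything written to this directory on the host is also visible in the mapped directory inside the container.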

[root@k8s-master volume7]# ls
sts.txt

IV. Summary

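StatefulSet is well suited to workloads such as database instances that need data persistence, a defined startup order, and mutual access between instances. Pay attention to the creation order: create PV -> create PVC -> create Headless Service -> create StatefulSet.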
