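When a workload uses a cloud disk volume, the disk is usually provisioned at a suitable size when the service is first initialized. As the data grows, that capacity eventually becomes insufficient and the disk has to be expanded.
In a traditional deployment, expansion typically means stopping the application by hand, backing up the data disk, performing the expansion, and then starting the application again.
Kubernetes is an automated scheduling and orchestration system that manages the lifecycle of data volumes. In K8S 1.14, CSI volume expansion is an Alpha feature and must be enabled through Feature Gates before it can be used.
This article describes how to dynamically expand a cloud disk in a CSI environment.
Usage notes: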
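1. Data backup:
Important: before expanding a volume, take a snapshot of the cloud disk so that the data can be recovered if the expansion fails partway through.
2. Cluster dependencies:
Expanding a cloud disk calls the corresponding cloud disk expansion API, so the cluster must be authorized to call that API. The detailed steps for granting this permission are given later in this article.
3. Volume restrictions:
Only dynamically provisioned volumes, that is, PVs configured with a StorageClassName, can be expanded dynamically.
InlineVolume cloud disk volumes (those not defined through a PV and PVC) cannot be expanded.
Basic cloud disks do not support dynamic expansion; use the manual cloud disk expansion procedure instead.
4. StorageClass requirements:
The StorageClass used by the PVC must be of the Alibaba Cloud disk type, with provisioner diskplugin.csi.alibabacloud.com.
The StorageClass must set allowVolumeExpansion: true; in ACK clusters this defaults to true.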
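As a rough illustration of the StorageClass requirement above, the sketch below writes a StorageClass manifest with expansion enabled. The class name `alicloud-disk-ssd-expandable`, the output path, and the `type: cloud_ssd` parameter are assumptions made for this example; the classes that ACK ships may be defined differently.

```shell
# Write an example StorageClass manifest (name and parameters are illustrative).
SC_FILE="${TMPDIR:-/tmp}/sc-expandable.yaml"
cat > "$SC_FILE" <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: alicloud-disk-ssd-expandable   # hypothetical name
provisioner: diskplugin.csi.alibabacloud.com
parameters:
  type: cloud_ssd                      # assumed disk category
reclaimPolicy: Delete
allowVolumeExpansion: true             # required for PVC resize
EOF

# Against a live cluster (not run here):
#   kubectl apply -f "$SC_FILE"
#   kubectl get sc alicloud-disk-ssd -o jsonpath='{.allowVolumeExpansion}'
```

The second commented command is one way to confirm that an existing class, such as the `alicloud-disk-ssd` class used later in this article, already allows expansion.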
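Prerequisites
Create an Alibaba Cloud ACK Kubernetes cluster, version 1.14 or later, and select the CSI storage plugin when creating it.
1. Configure the Feature Gate (for K8S 1.14 clusters):
In K8S 1.14, resize is still an Alpha feature, so the following feature gate flag must be added to both kube-controller-manager and kubelet:
--feature-gates=ExpandCSIVolumes=true
Update kube-controller-manager by adding the flag to its manifest:
/etc/kubernetes/manifests/kube-controller-manager.yaml
Update kubelet by adding the flag to its drop-in file on every node (with many nodes, this can be scripted), then reload systemd and restart kubelet:
/etc/systemd/system/kubelet.service.d/10-kubeadm.conf
systemctl daemon-reload
service kubelet restart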
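After editing the two files, it can help to confirm that the flag actually landed in them. The following is a minimal sketch, not part of the original procedure: `has_expand_gate` is a hypothetical helper that simply greps a file for the `ExpandCSIVolumes=true` gate, allowing for other gates listed before it.

```shell
# has_expand_gate FILE: succeed if FILE enables the ExpandCSIVolumes gate.
# (Hypothetical helper; it only inspects the file text, not the running process.)
has_expand_gate() {
  grep -Eq -- '--feature-gates=([^[:space:]"]*,)?ExpandCSIVolumes=true' "$1"
}

# Report the state of the two files named above, if they exist on this host.
for f in /etc/kubernetes/manifests/kube-controller-manager.yaml \
         /etc/systemd/system/kubelet.service.d/10-kubeadm.conf; do
  if [ -f "$f" ]; then
    if has_expand_gate "$f"; then
      echo "$f: ExpandCSIVolumes enabled"
    else
      echo "$f: ExpandCSIVolumes not found"
    fi
  fi
done
```

Because the check reads only the files, a component that has not been restarted since the edit may still be running without the gate.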
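2. Grant the cluster expansion permission:
Expanding a cloud disk requires the ResizeDisk permission on the cluster's RAM role. Which role to edit depends on the cluster type:
Dedicated cluster:
Go to Clusters --> Manage --> Cluster Resources and click "Master RAM Role"; edit the RAM policy and add ResizeDisk.
Managed cluster:
Go to Clusters --> Manage --> Cluster Resources and click "Worker RAM Role"; edit the RAM policy and add ResizeDisk.
3. Deploy the resizer plugin (for K8S 1.14 clusters):
Use the following template: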
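For readers who prefer to see the permission as text rather than in the console, here is a sketch of what the added policy statement might look like. It assumes that the RAM action name for the ResizeDisk API is `ecs:ResizeDisk` and that granting it on all resources is acceptable; the output path is arbitrary. Check both assumptions against the role's existing policy before using it.

```shell
# Write an example RAM policy document containing only the ResizeDisk statement.
# The action name "ecs:ResizeDisk" and the "*" resource scope are assumptions.
POLICY_FILE="${TMPDIR:-/tmp}/resize-disk-policy.json"
cat > "$POLICY_FILE" <<'EOF'
{
  "Version": "1",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["ecs:ResizeDisk"],
      "Resource": ["*"]
    }
  ]
}
EOF
echo "policy written to $POLICY_FILE"
```

In practice the statement would be merged into the policy already attached to the Master or Worker RAM role rather than replacing it.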
kind: Service
apiVersion: v1
metadata:
  name: csi-resizer
  namespace: kube-system
  labels:
    app: csi-resizer
spec:
  selector:
    app: csi-resizer
  ports:
    - name: dummy
      port: 12345
---
kind: StatefulSet
apiVersion: apps/v1
metadata:
  name: csi-resizer
  namespace: kube-system
spec:
  serviceName: "csi-resizer"
  selector:
    matchLabels:
      app: csi-resizer
  template:
    metadata:
      labels:
        app: csi-resizer
    spec:
      tolerations:
      - operator: "Exists"
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            preference:
              matchExpressions:
              - key: node-role.kubernetes.io/master
                operator: Exists
      priorityClassName: system-node-critical
      serviceAccount: admin
      hostNetwork: true
      containers:
        - name: csi-resizer
          image: registry.cn-hangzhou.aliyuncs.com/acs/csi-resizer:v0.3.0
          args:
            - "--v=5"
            - "--csi-address=$(ADDRESS)"
            - "--leader-election"
          env:
            - name: ADDRESS
              value: /socketDir/csi.sock
          imagePullPolicy: "Always"
          volumeMounts:
            - name: socket-dir
              mountPath: /socketDir/
        - name: csi-diskplugin
          securityContext:
            privileged: true
            capabilities:
              add: ["SYS_ADMIN"]
            allowPrivilegeEscalation: true
          image: registry.cn-hangzhou.aliyuncs.com/acs/csi-plugin:v1.14.8.32-c77e277b-aliyun
          imagePullPolicy: "Always"
          args:
            - "--endpoint=$(CSI_ENDPOINT)"
            - "--v=5"
            - "--driver=diskplugin.csi.alibabacloud.com"
          env:
            - name: CSI_ENDPOINT
              value: unix://socketDir/csi.sock
          volumeMounts:
            - mountPath: /var/log/
              name: host-log
            - mountPath: /socketDir/
              name: socket-dir
            - name: etc
              mountPath: /host/etc
      volumes:
        - name: socket-dir
          emptyDir: {}
        - name: host-log
          hostPath:
            path: /var/log/
        - name: etc
          hostPath:
            path: /etc
  updateStrategy:
    type: RollingUpdate
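Once the template is applied, it is worth checking that the resizer is actually running before attempting an expansion. The helper below is a sketch under the assumption that the StatefulSet keeps the name and namespace from the template; `resizer_ready` is a name invented for this example.

```shell
# resizer_ready: succeed when the csi-resizer StatefulSet reports at least
# one ready replica. (Hypothetical helper; names come from the template above.)
resizer_ready() {
  ready=$(kubectl -n kube-system get statefulset csi-resizer \
            -o jsonpath='{.status.readyReplicas}' 2>/dev/null)
  [ "${ready:-0}" -ge 1 ]
}

# Usage against a live cluster (not run here):
#   if resizer_ready; then
#     echo "csi-resizer is ready"
#   else
#     kubectl -n kube-system logs csi-resizer-0 -c csi-resizer
#   fi
```

The Pod name `csi-resizer-0` follows from the StatefulSet naming convention; the logs of the `csi-resizer` container are usually the first place to look if the Pod does not become ready.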
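Expanding the cloud disk volume:
1. Create the application
Create an nginx application and mount a 20 GiB cloud disk volume into the Pod. The PVC and Deployment templates are as follows: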
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-disk
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: alicloud-disk-ssd
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dynamic-create
  labels:
    app: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
        volumeMounts:
          - name: disk-pvc
            mountPath: "/data"
      volumes:
        - name: disk-pvc
          persistentVolumeClaim:
            claimName: pvc-disk
The application's current state is as follows:
The cloud disk mounted in the Pod is 20 GiB:
# kubectl get pod
NAME READY STATUS RESTARTS AGE
dynamic-create-857bd875b5-n82d4 1/1 Running 0 107s
# kubectl exec -ti dynamic-create-857bd875b5-n82d4 df | grep data
/dev/vdb 20511312 45080 20449848 1% /data
Both the PVC and the PV report a size of 20Gi:
# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvc-disk Bound d-wz9g8sl8dl1ks8hz2m82 20Gi RWO alicloud-disk-ssd 2m17s
# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
d-wz9g8sl8dl1ks8hz2m82 20Gi RWO Delete Bound default/pvc-disk alicloud-disk-ssd 2m15s
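Since a snapshot should be taken before any expansion, this is a convenient point to look up which cloud disk backs the PVC. The sketch below assumes the disk ID is stored in the PV's CSI `volumeHandle` field, which matches the `d-...` PV name in the output above; `disk_id_for_pvc` is a hypothetical helper, and the commented snapshot command assumes the `aliyun` CLI is installed and configured for the right region.

```shell
# disk_id_for_pvc PVC [NAMESPACE]: print the cloud disk ID backing a bound PVC.
# (Hypothetical helper; assumes the disk ID is the PV's CSI volumeHandle.)
disk_id_for_pvc() {
  pv=$(kubectl -n "${2:-default}" get pvc "$1" \
         -o jsonpath='{.spec.volumeName}') || return 1
  [ -n "$pv" ] || return 1
  kubectl get pv "$pv" -o jsonpath='{.spec.csi.volumeHandle}'
}

# Usage against a live cluster (not run here):
#   disk=$(disk_id_for_pvc pvc-disk)
#   aliyun ecs CreateSnapshot --DiskId "$disk" \
#     --SnapshotName "before-resize-$(date +%Y%m%d)"
```

The snapshot name is arbitrary; any naming scheme that makes the pre-expansion snapshot easy to find later would do.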
2. Expand the cloud disk volume:
Run the following command to expand the cloud disk:
# kubectl patch pvc pvc-disk -p '{"spec":{"resources":{"requests":{"storage":"30Gi"}}}}'
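Updating the PVC size causes the Resizer to call the cloud disk API and expand the disk. The console shows that the disk is now 30 GiB, and the PV size is updated to 30Gi, while the PVC still reports 20Gi because the file system has not been expanded yet: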
# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvc-disk Bound d-wz9g8sl8dl1ks8hz2m82 20Gi RWO alicloud-disk-ssd 13m
# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
d-wz9g8sl8dl1ks8hz2m82 30Gi RWO Delete Bound default/pvc-disk alicloud-disk-ssd 13m
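At this point only the cloud disk itself has been expanded; the file system has not, so the storage space inside the container is still 20 GiB: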
# kubectl exec -ti dynamic-create-857bd875b5-n82d4 df /data
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/vdb 20511312 45080 20449848 1% /data
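The intermediate state shown here, where the request has been raised but the PVC's reported capacity has not caught up, can also be detected from the PVC object itself. The following is a minimal sketch; `pvc_resize_pending` is a hypothetical helper that compares the two fields as plain strings, so it assumes both are expressed in the same unit.

```shell
# pvc_resize_pending PVC: print the requested and actual sizes, and succeed
# while they differ, i.e. while the file system expansion is still outstanding.
# (Hypothetical helper; compares quantities as strings, e.g. 30Gi vs 20Gi.)
pvc_resize_pending() {
  want=$(kubectl get pvc "$1" \
           -o jsonpath='{.spec.resources.requests.storage}') || return 2
  have=$(kubectl get pvc "$1" \
           -o jsonpath='{.status.capacity.storage}') || return 2
  echo "requested=$want actual=$have"
  [ "$want" != "$have" ]
}

# Usage against a live cluster (not run here):
#   pvc_resize_pending pvc-disk && echo "file system expansion still pending"
#   kubectl get pvc pvc-disk -o jsonpath='{.status.conditions[*].type}'
```

On clusters where the resizer records it, the second commented command may show a `FileSystemResizePending` condition on the PVC during this window; whether it appears depends on the resizer version.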
Trigger the file system expansion by deleting the Pod; the Deployment recreates it and the volume is remounted:
# kubectl delete pod dynamic-create-857bd875b5-n82d4
pod "dynamic-create-857bd875b5-n82d4" deleted
# kubectl get pod
NAME READY STATUS RESTARTS AGE
dynamic-create-857bd875b5-4gng9 1/1 Running 0 38s
The file system has now been expanded to 30 GiB:
# kubectl exec -ti dynamic-create-857bd875b5-4gng9 df /data
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/vdb 30832548 45036 30771128 1% /data
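The `df` figures are in 1K blocks, which makes the before and after sizes hard to compare with the 20Gi and 30Gi requests at a glance. As a small worked example, the conversion can be done as follows; `kib_to_gib` is a helper written for this illustration.

```shell
# kib_to_gib KIB: convert a count of 1K blocks (as printed by df) to GiB,
# rounded to two decimal places. 1 GiB = 1048576 KiB.
kib_to_gib() {
  LC_ALL=C awk -v kib="$1" 'BEGIN { printf "%.2f\n", kib / 1048576 }'
}

kib_to_gib 20511312   # size before expansion, prints 19.56
kib_to_gib 30832548   # size after expansion, prints 29.40
```

Both values come out slightly below the nominal 20 GiB and 30 GiB, presumably because part of the disk is taken up by file system metadata.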
The steps above complete a cloud disk expansion in a CSI environment.