Alibaba Cloud Container Service for Kubernetes (ACK) was among the first platforms worldwide to pass the Kubernetes Conformance Certification. It provides high-performance management of containerized applications and supports the full lifecycle of enterprise-grade Kubernetes workloads, letting you run Kubernetes applications in the cloud easily and efficiently. ACK comes in three forms: Dedicated Kubernetes, Managed Kubernetes, and Serverless Kubernetes. In this article we will create a managed ACK cluster, and then add the Serverless Kubernetes capability to it as a plug-in, building an elastic container cluster powered by ACK together with ECI (Elastic Container Instance).
Prerequisites
- Before first use, you need to activate Container Service for Kubernetes (ACK) and grant it access to the required cloud resources:
- Log on to the ACK activation page.
- Read and select the ACK terms of service.
- Click Activate Now.
- Log on to the Container Service console.
- On the page prompting you to create the default roles for Container Service, click "Go to RAM for authorization" to open the cloud resource authorization page, then click "Confirm Authorization Policy". After the authorization is complete, refresh the console to start using ACK.
- Create a VPC in one Alibaba Cloud region in advance, pick a single availability zone, and create two vswitches in that zone: one for the nodes and one for the pods (a CLI sketch follows).
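If you prefer the command line, this prerequisite network can be sketched with the aliyun CLI. The region, zone, CIDR blocks, and the vpc-xxxxxxxx placeholder below are illustrative; substitute the VPC ID returned by CreateVpc:

# Create the VPC (CIDR matches the Terraform defaults used later)
aliyun vpc CreateVpc --RegionId cn-beijing --CidrBlock 10.0.0.0/8 --VpcName workshop-vpc
# Create one vswitch for the nodes and one for the pods, in the same zone
aliyun vpc CreateVSwitch --RegionId cn-beijing --VpcId vpc-xxxxxxxx \
  --ZoneId cn-beijing-h --CidrBlock 10.1.0.0/16 --VSwitchName node-vswitch
aliyun vpc CreateVSwitch --RegionId cn-beijing --VpcId vpc-xxxxxxxx \
  --ZoneId cn-beijing-h --CidrBlock 10.4.0.0/16 --VSwitchName pod-vswitch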
Create an ACK Cluster
An ACK cluster can be created through the ACK console, the OpenAPI, Terraform, and other methods. Here we walk through the console configuration, and we also provide a Terraform script so you can experience infrastructure as code on Alibaba Cloud.
Method 1 (console UI):
- For all configuration items involved in cluster creation and their explanations, see: https://help.aliyun.com/document_detail/95108.html
1. Cluster configuration
- Cluster name: bj-workshop
- Cluster specification: Pro
- Region: Beijing (cn-beijing)
- Billing method: pay-as-you-go
- Kubernetes version: 1.20.11
- Container runtime: docker
- VPC: the VPC created in the prerequisites
- Network plug-in: Terway
- Node vswitch: the node vswitch
- Pod vswitch: the pod vswitch
- Service CIDR: 192.168.0.0/16
- Configure SNAT: enabled
- API server: internal access + slb.s1.small
- Security group: automatically create an enterprise security group
2. Node pool configuration
- Instance type: ecs.g6e.xlarge
- Quantity: 2
- System disk: ESSD, 40 GiB
- Data disk: ESSD, 120 GiB
- Operating system: Alibaba Cloud Linux
- Password: Just4Test
3. Component configuration
- Ingress: NGINX Ingress + internal network + SLB specification slb.s1.small
- Storage: uncheck "Create default NAS file system and CNFS container network file system dynamic storage class"
4. Confirm the configuration. If any dependency check fails, find the cause and resolve it first, then accept the terms of service and start the creation.
5. Creating a managed cluster takes about 15 minutes.
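Once the cluster state is Running, you can obtain the kubeconfig from the cluster details page in the console and run a quick sanity check. This sketch assumes kubectl is installed locally and the kubeconfig has been saved to ~/.kube/config:

# Both ECS worker nodes should be in the Ready state
kubectl get nodes -o wide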
Method 2 (Terraform):
This method creates the ACK cluster with Terraform. In this example, Terraform provisions all of the required cloud resources directly, including the VPC, the vswitches, and the ACK cluster itself.
- Install Terraform. See: https://learn.hashicorp.com/tutorials/terraform/install-cli
- Copy the following files into your working directory:
main.tf
## Specify the aliyun/alicloud provider version
terraform {
  required_providers {
    alicloud = {
      source  = "aliyun/alicloud"
      version = "1.141.0"
    }
  }
}

# If no vpc_id is specified, launch a new VPC
resource "alicloud_vpc" "vpc" {
  count      = var.vpc_id == "" ? 1 : 0
  cidr_block = var.vpc_cidr
}

# Launch vswitches for worker nodes according to the vswitch CIDR blocks
resource "alicloud_vswitch" "vswitches" {
  count      = length(var.vswitch_ids) > 0 ? 0 : length(var.vswitch_cidrs)
  vpc_id     = var.vpc_id == "" ? join("", alicloud_vpc.vpc.*.id) : var.vpc_id
  cidr_block = element(var.vswitch_cidrs, count.index)
  zone_id    = element(var.zone_id, count.index)
}

# Launch vswitches for Terway pods according to the vswitch CIDR blocks
resource "alicloud_vswitch" "terway_vswitches" {
  count      = length(var.terway_vswitch_ids) > 0 ? 0 : length(var.terway_vswitch_cidrs)
  vpc_id     = var.vpc_id == "" ? join("", alicloud_vpc.vpc.*.id) : var.vpc_id
  cidr_block = element(var.terway_vswitch_cidrs, count.index)
  zone_id    = element(var.zone_id, count.index)
}

resource "alicloud_cs_managed_kubernetes" "k8s" {
  kube_config = var.kube_config
  count       = var.k8s_number
  # version cannot be defined in variables.tf. Options: 1.18.8-aliyun.1 | 1.20.11-aliyun.1
  version                      = "1.20.11-aliyun.1"
  # name_prefix                = "terraform_"
  name                         = "bj-workshop"
  is_enterprise_security_group = true
  cluster_spec                 = "ack.pro.small"
  worker_vswitch_ids           = length(var.vswitch_ids) > 0 ? split(",", join(",", var.vswitch_ids)) : length(var.vswitch_cidrs) < 1 ? [] : split(",", join(",", alicloud_vswitch.vswitches.*.id))
  pod_vswitch_ids              = length(var.terway_vswitch_ids) > 0 ? split(",", join(",", var.terway_vswitch_ids)) : length(var.terway_vswitch_cidrs) < 1 ? [] : split(",", join(",", alicloud_vswitch.terway_vswitches.*.id))
  worker_instance_types        = var.worker_instance_types
  worker_disk_category         = "cloud_essd"
  worker_disk_size             = 40
  worker_data_disks {
    category          = "cloud_essd"
    size              = "100"
    encrypted         = false
    performance_level = "PL0"
  }
  worker_number         = var.worker_number
  node_cidr_mask        = var.node_cidr_mask
  enable_ssh            = var.enable_ssh
  install_cloud_monitor = var.install_cloud_monitor
  cpu_policy            = var.cpu_policy
  proxy_mode            = var.proxy_mode
  password              = var.password
  service_cidr          = var.service_cidr
  dynamic "addons" {
    for_each = var.cluster_addons
    content {
      name   = lookup(addons.value, "name", var.cluster_addons)
      config = lookup(addons.value, "config", var.cluster_addons)
    }
  }
  # runtime = {
  #   name    = "docker"
  #   version = "19.03.5"
  # }
}
variables.tf
# Configure the Alibaba Cloud Terraform provider
provider "alicloud" {
  # Region in which to create the resources
  region = "cn-beijing"
}

variable "k8s_number" {
  description = "The number of kubernetes clusters."
  default     = 1
}

variable "zone_id" {
  description = "The availability zones of vswitches."
  default     = ["cn-beijing-h", "cn-beijing-i", "cn-beijing-j"]
}

# Leave it empty to create a new VPC
variable "vpc_id" {
  description = "Existing vpc id used to create several vswitches and other resources."
  default     = ""
}

variable "vpc_cidr" {
  description = "The cidr block used to launch a new vpc when 'vpc_id' is not specified."
  default     = "10.0.0.0/8"
}

# Leave it empty and terraform will create several vswitches
variable "vswitch_ids" {
  description = "List of existing vswitch ids."
  type        = list(string)
  default     = []
}

variable "vswitch_cidrs" {
  description = "List of cidr blocks used to create several new vswitches when 'vswitch_ids' is not specified."
  type        = list(string)
  default     = ["10.1.0.0/16", "10.2.0.0/16", "10.3.0.0/16"]
}

variable "new_nat_gateway" {
  description = "Whether to create a new nat gateway. In this template, a new nat gateway will create a nat gateway, eip and server snat entries."
  default     = "true"
}

# 3 masters is the default setting, so choose three appropriate instance types in the availability zones above.
# variable "master_instance_types" {
#   description = "The ecs instance types used to launch master nodes."
#   default     = ["ecs.n4.xlarge", "ecs.n4.xlarge", "ecs.sn1ne.xlarge"]
# }

variable "worker_instance_types" {
  description = "The ecs instance types used to launch worker nodes."
  default     = ["ecs.g6e.xlarge"]
  # default   = ["ecs.g5ne.2xlarge", "ecs.sn1ne.xlarge", "ecs.n4.xlarge"]
}

# Options: between 24-28
variable "node_cidr_mask" {
  description = "The node cidr block that determines how many pods can run on a single node."
  default     = 24
}

variable "enable_ssh" {
  description = "Enable login to the node through SSH."
  default     = true
}

variable "install_cloud_monitor" {
  description = "Install cloud monitor agent on ECS."
  default     = true
}

# Options: none | static
variable "cpu_policy" {
  description = "kubelet cpu policy. Default: none."
  default     = "none"
}

# Options: ipvs | iptables
variable "proxy_mode" {
  description = "Proxy mode is an option of kube-proxy."
  default     = "ipvs"
}

variable "password" {
  description = "The password of ECS instances."
  default     = "Just4Test"
}

variable "worker_number" {
  description = "The number of worker nodes in the kubernetes cluster."
  default     = 2
}

variable "service_cidr" {
  description = "The kubernetes service cidr block. It cannot be equal to the vpc's, vswitch's or pod's cidr, and cannot be contained in them."
  default     = "172.21.0.0/20"
}

variable "terway_vswitch_ids" {
  description = "List of existing vswitch ids for terway."
  type        = list(string)
  default     = []
}

variable "terway_vswitch_cidrs" {
  description = "List of cidr blocks used to create several new vswitches when 'terway_vswitch_ids' is not specified."
  type        = list(string)
  default     = ["10.4.0.0/16", "10.5.0.0/16", "10.6.0.0/16"]
}

variable "cluster_addons" {
  type = list(object({
    name   = string
    config = string
  }))
  default = [
    {
      # Terway default mode
      "name"   = "terway-eniip",
      # "name" = "terway-eni",  # Terway exclusive ENI mode
      "config" = "",
    },
    {
      "name"   = "csi-plugin",
      "config" = "",
    },
    {
      "name"   = "csi-provisioner",
      "config" = "",
    },
    {
      "name"   = "alicloud-disk-controller",
      "config" = "",
    },
    {
      "name"   = "logtail-ds",
      "config" = "{\"IngressDashboardEnabled\":\"true\"}",
    },
    {
      # Use an internal SLB for the ingress, consistent with the console method above
      "name"   = "nginx-ingress-controller",
      "config" = "{\"IngressSlbNetworkType\":\"intranet\"}",
    },
    {
      "name"   = "arms-prometheus",
      "config" = "",
    },
    {
      "name"   = "ack-node-problem-detector",
      "config" = "",
    },
    {
      "name"   = "ack-kubernetes-cronhpa-controller",
      "config" = "",
    },
    {
      "name"   = "ack-node-local-dns",
      "config" = "",
    }
  ]
}

variable "kube_config" {
  description = "kubeconfig path, supported since provider 1.105.0"
  default     = "~/.kube/config-terraform"
}
- Configure the Alibaba Cloud provider credentials and initialize the project:
export ALICLOUD_ACCESS_KEY="xxxxxx"
export ALICLOUD_SECRET_KEY="xxxxxx"
export ALICLOUD_REGION="cn-beijing"

# Init phase
terraform init
# Planning phase
terraform plan
# Apply phase
terraform apply
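After terraform apply completes, the cluster's kubeconfig is written to the path set by the kube_config variable (~/.kube/config-terraform in variables.tf above). A minimal connectivity check, assuming kubectl is installed locally:

export KUBECONFIG=~/.kube/config-terraform
kubectl get nodes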
- To tear down all of the cloud resources created above:
# Destroy
terraform destroy
Create an ECI Virtual Node
Prerequisites: the ACK cluster created above, with the Elastic Container Instance (ECI) service activated for your account.
Installation steps:
- Log on to the ACK console and click the cluster name to open the cluster.
- Under "Operations", open "Add-ons".
- Find the "ack-virtual-node" component and click "Install".
- After the installation succeeds, the node list under "Nodes" in the cluster shows an additional virtual node named like virtual-kubelet.
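You can also confirm the virtual node from the command line; the exact node name suffix depends on your region and availability zone:

# The output should include a node named like virtual-kubelet-<zone-id>
kubectl get nodes | grep virtual-kubelet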
Bind a Public EIP to the Ingress Controller's Private SLB
Due to the constraints of this workshop environment, we did not let the NGINX ingress controller automatically create a public SLB as its entry point. So that the following experiments can proceed, we need to go to the SLB console and manually bind an EIP to the ingress controller's private SLB. (If your trial account has a balance of more than 100 CNY, you can instead purchase a public SLB directly for the experiment.)
- In the SLB console, disable modification protection on the SLB instance that belongs to the ingress controller.
- Bind an EIP to that SLB instance in the SLB console (a CLI sketch follows).
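If you prefer to script this step, the same binding can be sketched with the aliyun CLI. AllocateEipAddress and AssociateEipAddress are the underlying VPC API operations; eip-xxxxxxxx and lb-xxxxxxxx below are placeholders for your own EIP allocation ID and the ingress controller's SLB instance ID:

# Allocate a pay-by-traffic EIP (parameters are illustrative)
aliyun vpc AllocateEipAddress --RegionId cn-beijing --InternetChargeType PayByTraffic
# Bind the EIP to the ingress controller's private SLB instance
aliyun vpc AssociateEipAddress --RegionId cn-beijing \
  --AllocationId eip-xxxxxxxx --InstanceId lb-xxxxxxxx --InstanceType SlbInstance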
Deploy a Simple Application (Optional)
In this example we use YAML to deploy a Deployment and expose it externally through an NGINX Ingress backed by a ClusterIP Service. The pod template carries the elastic-scheduling annotation alibabacloud.com/burst-resource: eci, so pods that cannot be scheduled onto the cluster's ECS nodes are scheduled onto the ECI virtual node instead.
cat << EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
      annotations:
        # Elastic scheduling: fall back to ECI when the cluster's ECS resources are insufficient
        alibabacloud.com/burst-resource: eci
    spec:
      containers:
      - image: 'registry-vpc.cn-beijing.aliyuncs.com/haoshuwei/nginx:latest'
        imagePullPolicy: Always
        name: nginx
        resources:
          limits:
            cpu: '4'
            memory: 8Gi
          requests:
            cpu: '4'
            memory: 8Gi
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-svc
  namespace: default
spec:
  ports:
  - name: '80'
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: nginx
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/service-weight: ''
  name: nginx-ingress
  namespace: default
spec:
  rules:
  - host: alibj.workshop.com
    http:
      paths:
      - backend:
          service:
            name: nginx-svc
            port:
              number: 80
        path: /
        pathType: ImplementationSpecific
EOF
Verification:
- Use kubectl to check where the pods were scheduled. Because the cluster does not have enough free resources for pods of the requested size, ECI pods are created automatically, and the output shows the pods running on the virtual node:
kubectl get po -o wide
NAME                    READY   STATUS    RESTARTS   AGE   IP             NODE                            NOMINATED NODE   READINESS GATES
nginx-7797c877f-cfxjh   1/1     Running   0          74s   10.1.126.225   virtual-kubelet-cn-hangzhou-h   <none>           <none>
nginx-7797c877f-rjk4v   1/1     Running   0          74s   10.1.126.224   virtual-kubelet-cn-hangzhou-h   <none>           <none>
- Log on to one of the ECS worker nodes and use curl to check whether the nginx service just deployed is reachable. Replace ingress-controller-eip in the command with the EIP you bound above:
curl -H "Host: alibj.workshop.com" http://ingress-controller-eip/
Cleanup:
Use kubectl to delete the resources from the example above:
kubectl delete deploy nginx
kubectl delete service nginx-svc
kubectl delete ingress nginx-ingress
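A quick check that nothing is left behind; the nginx deployment, service, and ingress should no longer appear:

kubectl get deploy,svc,ing -n default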
Summary:
At this point, we have quickly created an ACK cluster and, at the same time, extended it with an ECI virtual node that provides massive elastic scaling capacity.