Kubernetes部署-RKE自动化部署

2024-01-31 17:08:52

一、简介

RKE：Rancher Kubernetes Engine
一个极其简单，闪电般快速的Kubernetes安装程序，可在任何地方使用。

二、准备工作

I、配置系统

系统：CentOS 7 / Ubuntu
配置完系统后安装必要的软件：

yum install lvm2 parted lrzsz -y
# 查看需要配置的磁盘
fdisk -l
# 如：/dev/sda
fdisk /dev/sda # 根据提示进行分区
# 配置lvm卷
pvcreate /dev/sda1
vgcreate disk1 /dev/sda1
lvcreate -n data -l +100%FREE disk1
# 格式化磁盘
mkfs.xfs /dev/disk1/data
# 写入开机自动加载
diskuuid=`blkid /dev/disk1/data | awk '{print $2}' | tr -d '"'`
echo "$diskuuid /data                   xfs     defaults        0 0" >> /etc/fstab
# 判断/data目录是否存在并挂载磁盘
[ -d /data ] || mkdir /data
mount -a

II、安装docker

可以根据rancher提供的匹配版本进行安装：

DOCKER VERSION	INSTALL SCRIPT
18.09.2	`curl [https://releases.rancher.com/install-docker/18.09.2.sh](https://releases.rancher.com/install-docker/18.09.2.sh) \| sh`
18.06.2	`curl [https://releases.rancher.com/install-docker/18.06.2.sh](https://releases.rancher.com/install-docker/18.06.2.sh) \| sh`
17.03.2	`curl [https://releases.rancher.com/install-docker/17.03.2.sh](https://releases.rancher.com/install-docker/17.03.2.sh) \| sh`

也可以通过如下命令进行安装：

# 配置yum源
sudo yum remove docker docker-common docker-selinux docker-engine
wget -O /etc/yum.repos.d/docker-ce.repo https://download.docker.com/linux/centos/docker-ce.repo
sudo sed -i 's+download.docker.com+mirrors.tuna.tsinghua.edu.cn/docker-ce+' /etc/yum.repos.d/docker-ce.repo
yum install -y -q docker-ce-18.09.2 # 此处指定你要安装的版本即可

配置docker的daemon.json文件：

systemctl enable docker
systemctl start docker
echo '''{
  "data-root": "/data/docker",
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  },
  "registry-mirrors": [
    "https://hub-mirror.c.163.com",
    "https://docker.mirrors.ustc.edu.cn",
    "https://dockerhub.azk8s.cn"
  ]
}''' > /etc/docker/daemon.json
[ -d /data/docker ] || mkdir /data/docker
systemctl restart docker

至此，基本环境已经配置完成。

三、安装配置

假设初始化环境如下：

IP	系统
192.168.0.1	CentOS 7
192.168.0.2	CentOS 7
192.168.0.3	CentOS 7

首先需要配置进行集群创建的用户以及免密登陆：

useradd admin
usermod -aG docker admin
su - admin
cd .ssh/
ssh-keygen -t rsa # 一路回车完成配置
echo <PublicKeys> >> /home/admin/.ssh/authorized_keys

配置完免密登陆，我们需要下载rke软件并进行集群配置文件设置：

# 下载rke软件
# github地址：https://github.com/rancher/rke
wget https://github.com/rancher/rke/releases/download/v1.0.4/rke_linux-amd64
ln -s rke_linux-amd64 /usr/local/bin/rke
# 配置rke配置文件
[ -d /data/k8s ] || mkdir /data/k8s ; cd /data/k8s
rke config --name cluster.yml # 按照提示进行rke配置
# 完整配置后
rke up # 等待安装完成

rke cluster.yml的示例：
官方示例一：

nodes:
    - address: 1.1.1.1
      user: ubuntu
      role:
        - controlplane
        - etcd
      ssh_key_path: /home/user/.ssh/id_rsa
      port: 2222
    - address: 2.2.2.2
      user: ubuntu
      role:
        - worker
      ssh_key: |-
        -----BEGIN RSA PRIVATE KEY-----

        -----END RSA PRIVATE KEY-----
    - address: example.com
      user: ubuntu
      role:
        - worker
      hostname_override: node3
      internal_address: 192.168.1.6
      labels:
        app: ingress

# If set to true, RKE will not fail when unsupported Docker version
# are found
ignore_docker_version: false

# Cluster level SSH private key
# Used if no ssh information is set for the node
ssh_key_path: ~/.ssh/test

# Enable use of SSH agent to use SSH private keys with passphrase
# This requires the environment `SSH_AUTH_SOCK` configured pointing
#to your SSH agent which has the private key added
ssh_agent_auth: true

# List of registry credentials
# If you are using a Docker Hub registry, you can omit the `url`
# or set it to `docker.io`
# is_default set to `true` will override the system default
# registry set in the global settings
private_registries:
     - url: registry.com
       user: Username
       password: password
       is_default: true

# Bastion/Jump host configuration
bastion_host:
    address: x.x.x.x
    user: ubuntu
    port: 22
    ssh_key_path: /home/user/.ssh/bastion_rsa
# or
#   ssh_key: |-
#     -----BEGIN RSA PRIVATE KEY-----
#
#     -----END RSA PRIVATE KEY-----

# Set the name of the Kubernetes cluster  
cluster_name: mycluster


# The Kubernetes version used. The default versions of Kubernetes
# are tied to specific versions of the system images.
#
# For RKE v0.2.x and below, the map of Kubernetes versions and their system images is
# located here:
# https://github.com/rancher/types/blob/release/v2.2/apis/management.cattle.io/v3/k8s_defaults.go
#
# For RKE v0.3.0 and above, the map of Kubernetes versions and their system images is
# located here:
# https://github.com/rancher/kontainer-driver-metadata/blob/master/rke/k8s_rke_system_images.go
#
# In case the kubernetes_version and kubernetes image in
# system_images are defined, the system_images configuration
# will take precedence over kubernetes_version.
kubernetes_version: v1.10.3-rancher2

# System Images are defaulted to a tag that is mapped to a specific
# Kubernetes Version and not required in a cluster.yml. 
# Each individual system image can be specified if you want to use a different tag.
#
# For RKE v0.2.x and below, the map of Kubernetes versions and their system images is
# located here:
# https://github.com/rancher/types/blob/release/v2.2/apis/management.cattle.io/v3/k8s_defaults.go
#
# For RKE v0.3.0 and above, the map of Kubernetes versions and their system images is
# located here:
# https://github.com/rancher/kontainer-driver-metadata/blob/master/rke/k8s_rke_system_images.go
#
system_images:
    kubernetes: rancher/hyperkube:v1.10.3-rancher2
    etcd: rancher/coreos-etcd:v3.1.12
    alpine: rancher/rke-tools:v0.1.9
    nginx_proxy: rancher/rke-tools:v0.1.9
    cert_downloader: rancher/rke-tools:v0.1.9
    kubernetes_services_sidecar: rancher/rke-tools:v0.1.9
    kubedns: rancher/k8s-dns-kube-dns-amd64:1.14.8
    dnsmasq: rancher/k8s-dns-dnsmasq-nanny-amd64:1.14.8
    kubedns_sidecar: rancher/k8s-dns-sidecar-amd64:1.14.8
    kubedns_autoscaler: rancher/cluster-proportional-autoscaler-amd64:1.0.0
    pod_infra_container: rancher/pause-amd64:3.1

services:
    etcd:
      # if external etcd is used
      # path: /etcdcluster
      # external_urls:
      #   - https://etcd-example.com:2379
      # ca_cert: |-
      #   -----BEGIN CERTIFICATE-----
      #   xxxxxxxxxx
      #   -----END CERTIFICATE-----
      # cert: |-
      #   -----BEGIN CERTIFICATE-----
      #   xxxxxxxxxx
      #   -----END CERTIFICATE-----
      # key: |-
      #   -----BEGIN PRIVATE KEY-----
      #   xxxxxxxxxx
      #   -----END PRIVATE KEY-----
    # Note for Rancher v2.0.5 and v2.0.6 users: If you are configuring
    # Cluster Options using a Config File when creating Rancher Launched
    # Kubernetes, the names of services should contain underscores
    # only: `kube_api`.
    kube-api:
      # IP range for any services created on Kubernetes
      # This must match the service_cluster_ip_range in kube-controller
      service_cluster_ip_range: 10.43.0.0/16
      # Expose a different port range for NodePort services
      service_node_port_range: 30000-32767    
      pod_security_policy: false
      # Add additional arguments to the kubernetes API server
      # This WILL OVERRIDE any existing defaults
      extra_args:
        # Enable audit log to stdout
        audit-log-path: "-"
        # Increase number of delete workers
        delete-collection-workers: 3
        # Set the level of log output to debug-level
        v: 4
    # Note for Rancher 2 users: If you are configuring Cluster Options
    # using a Config File when creating Rancher Launched Kubernetes,
    # the names of services should contain underscores only:
    # `kube_controller`. This only applies to Rancher v2.0.5 and v2.0.6.
    kube-controller:
      # CIDR pool used to assign IP addresses to pods in the cluster
      cluster_cidr: 10.42.0.0/16
      # IP range for any services created on Kubernetes
      # This must match the service_cluster_ip_range in kube-api
      service_cluster_ip_range: 10.43.0.0/16
    kubelet:
      # Base domain for the cluster
      cluster_domain: cluster.local
      # IP address for the DNS service endpoint
      cluster_dns_server: 10.43.0.10
      # Fail if swap is on
      fail_swap_on: false
      # Set max pods to 250 instead of default 110
      extra_args:
        max-pods: 250
      # Optionally define additional volume binds to a service
      extra_binds:
        - "/usr/libexec/kubernetes/kubelet-plugins:/usr/libexec/kubernetes/kubelet-plugins"

# Currently, only authentication strategy supported is x509.
# You can optionally create additional SANs (hostnames or IPs) to
# add to the API server PKI certificate.
# This is useful if you want to use a load balancer for the
# control plane servers.
authentication:
    strategy: x509
    sans:
      - "10.18.160.10"
      - "my-loadbalancer-1234567890.us-west-2.elb.amazonaws.com"

# Kubernetes Authorization mode
# Use `mode: rbac` to enable RBAC
# Use `mode: none` to disable authorization
authorization:
    mode: rbac

# If you want to set a Kubernetes cloud provider, you specify
# the name and configuration
cloud_provider:
    name: aws

# Add-ons are deployed using kubernetes jobs. RKE will give
# up on trying to get the job status after this timeout in seconds..
addon_job_timeout: 30

# Specify network plugin-in (canal, calico, flannel, weave, or none)
network:
    plugin: canal

# Specify DNS provider (coredns or kube-dns)
dns:
    provider: coredns

# Currently only nginx ingress provider is supported.
# To disable ingress controller, set `provider: none`
# `node_selector` controls ingress placement and is optional
ingress:
    provider: nginx
    node_selector:
      app: ingress
      
# All add-on manifests MUST specify a namespace
addons: |-
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: my-nginx
      namespace: default
    spec:
      containers:
      - name: my-nginx
        image: nginx
        ports:
        - containerPort: 80

addons_include:
    - https://raw.githubusercontent.com/rook/rook/master/cluster/examples/kubernetes/rook-operator.yaml
    - https://raw.githubusercontent.com/rook/rook/master/cluster/examples/kubernetes/rook-cluster.yaml
    - /path/to/manifest

官方示例二：（主要是为了着重展示扩展参数！）

nodes:
  - address: 1.1.1.1
    internal_address:
    user: ubuntu
    role:
      - controlplane
      - etcd
    ssh_key_path: /home/user/.ssh/id_rsa
    port: 2222
  - address: 2.2.2.2
    internal_address:
    user: ubuntu
    role:
      - worker
    ssh_key: |-
      -----BEGIN RSA PRIVATE KEY-----
      -----END RSA PRIVATE KEY-----
  - address: example.com
    internal_address:
    user: ubuntu
    role:
      - worker
    hostname_override: node3
    internal_address: 192.168.1.6
    labels:
      app: ingress
      app: dns

# 如果设置为true，则可以使用不受支持的Docker版本
ignore_docker_version: false

# 集群等级的SSH私钥(private key)
## 如果节点未配置SSH私钥，RKE将会以此私钥去连接集群节点
ssh_key_path: ~/.ssh/test

# 使用SSH agent来提供SSH私钥
## 需要配置环境变量`SSH_AUTH_SOCK`指向已添加私钥的SSH agent
ssh_agent_auth: false

# 配置docker root目录
docker_root_dir: "/var/lib/docker"

# 私有仓库
## 当设置`is_default: true`后，构建集群时会自动在配置的私有仓库中拉取镜像
## 如果使用的是DockerHub镜像仓库，则可以省略`url`或将其设置为`docker.io`
## 如果使用内部公开仓库，则可以不用设置用户名和密码

private_registries:
  - url: registry.com
    user: Username
    password: password
    is_default: true

# 堡垒机
## 如果集群节点需要通过堡垒机跳转，那么需要为RKE配置堡垒机信息
bastion_host:
  address: x.x.x.x
  user: ubuntu
  port: 22
  ssh_key_path: /home/user/.ssh/bastion_rsa
# or
#   ssh_key: |-
#     -----BEGIN RSA PRIVATE KEY-----
#
#     -----END RSA PRIVATE KEY-----

# 设置Kubernetes集群名称

# 定义kubernetes版本.
## 目前, 版本定义需要与rancher/types defaults map相匹配: https://github.com/rancher/types/blob/master/apis/management.cattle.io/v3/k8s_defaults.go#L14 （后期版本请查看: https://github.com/rancher/kontainer-driver-metadata/blob/master/rke/k8s_rke_system_images.go ）
## 如果同时定义了kubernetes_version和system_images中的kubernetes镜像，则system_images配置将优先于kubernetes_version
kubernetes_version: v1.14.3-rancher1

# `system_images`优先级更高，如果没有单独指定`system_images`镜像，则会使用`kubernetes_version`对应的默认镜像版本。
## 默认Tags: https://github.com/rancher/types/blob/master/apis/management.cattle.io/v3/k8s_defaults.go)(Rancher v2.3或者RKE v0.3之后的版本请查看: https://github.com/rancher/kontainer-driver-metadata/blob/master/rke/k8s_rke_system_images.go ）
system_images:
  etcd: rancher/coreos-etcd:v3.3.10-rancher1
  alpine: rancher/rke-tools:v0.1.34
  nginx_proxy: rancher/rke-tools:v0.1.34
  cert_downloader: rancher/rke-tools:v0.1.34
  kubernetes_services_sidecar: rancher/rke-tools:v0.1.34
  kubedns: rancher/k8s-dns-kube-dns:1.15.0
  dnsmasq: rancher/k8s-dns-dnsmasq-nanny:1.15.0
  kubedns_sidecar: rancher/k8s-dns-sidecar:1.15.0
  kubedns_autoscaler: rancher/cluster-proportional-autoscaler:1.3.0
  coredns: rancher/coredns-coredns:1.3.1
  coredns_autoscaler: rancher/cluster-proportional-autoscaler:1.3.0
  kubernetes: rancher/hyperkube:v1.14.3-rancher1
  flannel: rancher/coreos-flannel:v0.10.0-rancher1
  flannel_cni: rancher/flannel-cni:v0.3.0-rancher1
  calico_node: rancher/calico-node:v3.4.0
  calico_cni: rancher/calico-cni:v3.4.0
  calico_controllers: ""
  calico_ctl: rancher/calico-ctl:v2.0.0
  canal_node: rancher/calico-node:v3.4.0
  canal_cni: rancher/calico-cni:v3.4.0
  canal_flannel: rancher/coreos-flannel:v0.10.0
  weave_node: weaveworks/weave-kube:2.5.0
  weave_cni: weaveworks/weave-npc:2.5.0
  pod_infra_container: rancher/pause:3.1
  ingress: rancher/nginx-ingress-controller:0.21.0-rancher3
  ingress_backend: rancher/nginx-ingress-controller-defaultbackend:1.5-rancher1
  metrics_server: rancher/metrics-server:v0.3.1

services:
  etcd:
    # if external etcd is used
    # path: /etcdcluster
    # external_urls:
    #   - https://etcd-example.com:2379
    # ca_cert: |-
    #   -----BEGIN CERTIFICATE-----
    #   xxxxxxxxxx
    #   -----END CERTIFICATE-----
    # cert: |-
    #   -----BEGIN CERTIFICATE-----
    #   xxxxxxxxxx
    #   -----END CERTIFICATE-----
    # key: |-
    #   -----BEGIN PRIVATE KEY-----
    #   xxxxxxxxxx
    #   -----END PRIVATE KEY-----
    # Rancher 2用户注意事项：如果在创建Rancher Launched Kubernetes时使用配置文件配置集群，则`kube_api`服务名称应仅包含下划线。这仅适用于Rancher v2.0.5和v2.0.6。
    # 以下参数仅支持RKE部署的etcd集群

    # 开启自动备份
    ## rke版本小于0.2.x或rancher版本小于v2.2.0时使用
    snapshot: true
    creation: 5m0s
    retention: 24h
    ## rke版本大于等于0.2.x或rancher版本大于等于v2.2.0时使用(两段配置二选一)
    backup_config:
      enabled: true           # 设置true启用ETCD自动备份，设置false禁用；
      interval_hours: 12      # 快照创建间隔时间，不加此参数，默认5分钟；
      retention: 6            # etcd备份保留份数；
      # S3配置选项
      s3backupconfig:
        access_key: "myaccesskey"
        secret_key:  "myaccesssecret"
        bucket_name: "my-backup-bucket"
        folder: "folder-name" # 此参数v2.3.0之后可用
        endpoint: "s3.eu-west-1.amazonaws.com"
        region: "eu-west-1"
    # 扩展参数
    extra_args:
      auto-compaction-retention: 240 #(单位小时)
      # 修改空间配额为$((6*1024*1024*1024))，默认2G,最大8G
      quota-backend-bytes: '6442450944'
  kube-api:
    # cluster_ip范围
    ## 这必须与kube-controller中的service_cluster_ip_range匹配
    service_cluster_ip_range: 10.43.0.0/16
    # NodePort映射的端口范围
    service_node_port_range: 30000-32767
    # Pod安全策略
    pod_security_policy: false
    # kubernetes API server扩展参数
    ## 这些参数将会替换默认值
    extra_args:
      watch-cache: true
      default-watch-cache-size: 1500
      # 事件保留时间，默认1小时
      event-ttl: 1h0m0s
      # 默认值400，设置0为不限制，一般来说，每25~30个Pod有15个并行
      max-requests-inflight: 800
      # 默认值200，设置0为不限制
      max-mutating-requests-inflight: 400
      # kubelet操作超时，默认5s
      kubelet-timeout: 5s
      # 启用审计日志到标准输出
      audit-log-path: "-"
      # 增加删除workers的数量
      delete-collection-workers: 3
      # 将日志输出的级别设置为debug模式
      v: 4
  # Rancher 2用户注意事项：如果在创建Rancher Launched Kubernetes时使用配置文件配置集群，则`kube_controller`服务名称应仅包含下划线。这仅适用于Rancher v2.0.5和v2.0.6。
  kube-controller:
    # Pods_ip范围
    cluster_cidr: 10.42.0.0/16
    # cluster_ip范围
    ## 这必须与kube-api中的service_cluster_ip_range相同
    service_cluster_ip_range: 10.43.0.0/16
    extra_args:
      # 修改每个节点子网大小(cidr掩码长度)，默认为24，可用IP为254个；23，可用IP为510个；22，可用IP为1022个；
      node-cidr-mask-size: '24'

      feature-gates: "TaintBasedEvictions=false"
      # 控制器定时与节点通信以检查通信是否正常，周期默认5s
      node-monitor-period: '5s'
      ## 当节点通信失败后，再等一段时间kubernetes判定节点为notready状态。
      ## 这个时间段必须是kubelet的nodeStatusUpdateFrequency(默认10s)的整数倍，
      ## 其中N表示允许kubelet同步节点状态的重试次数，默认40s。
      node-monitor-grace-period: '20s'
      ## 再持续通信失败一段时间后，kubernetes判定节点为unhealthy状态，默认1m0s。
      node-startup-grace-period: '30s'
      ## 再持续失联一段时间，kubernetes开始迁移失联节点的Pod，默认5m0s。
      pod-eviction-timeout: '1m'

      # 默认5. 同时同步的deployment的数量。
      concurrent-deployment-syncs: 5
      # 默认5. 同时同步的endpoint的数量。
      concurrent-endpoint-syncs: 5
      # 默认20. 同时同步的垃圾收集器工作器的数量。
      concurrent-gc-syncs: 20
      # 默认10. 同时同步的命名空间的数量。
      concurrent-namespace-syncs: 10
      # 默认5. 同时同步的副本集的数量。
      concurrent-replicaset-syncs: 5
      # 默认5m0s. 同时同步的资源配额数。（新版本中已弃用）
      # concurrent-resource-quota-syncs: 5m0s
      # 默认1. 同时同步的服务数。
      concurrent-service-syncs: 1
      # 默认5. 同时同步的服务帐户令牌数。
      concurrent-serviceaccount-token-syncs: 5
      # 默认5. 同时同步的复制控制器的数量
      concurrent-rc-syncs: 5
      # 默认30s. 同步deployment的周期。
      deployment-controller-sync-period: 30s
      # 默认15s。同步PV和PVC的周期。
      pvclaimbinder-sync-period: 15s
  kubelet:
    # 集群搜索域
    cluster_domain: cluster.local
    # 内部DNS服务器地址
    cluster_dns_server: 10.43.0.10
    # 禁用swap
    fail_swap_on: false
    # 扩展变量
    extra_args:
      # 支持静态Pod。在主机/etc/kubernetes/目录下创建manifest目录，Pod YAML文件放在/etc/kubernetes/manifest/目录下
      pod-manifest-path: "/etc/kubernetes/manifest/"
      root-dir:  "/var/lib/kubelet"
      docker-root: "/var/lib/docker"
      feature-gates: "TaintBasedEvictions=false"
      # 指定pause镜像
      pod-infra-container-image: 'rancher/pause:3.1'
      # 传递给网络插件的MTU值，以覆盖默认值，设置为0(零)则使用默认的1460
      network-plugin-mtu: '1500'
      # 修改节点最大Pod数量
      max-pods: "250"
      # 密文和配置映射同步时间，默认1分钟
      sync-frequency: '3s'
      # Kubelet进程可以打开的文件数（默认1000000）,根据节点配置情况调整
      max-open-files: '2000000'
      # 与apiserver会话时的并发数，默认是10
      kube-api-burst: '30'
      # 与apiserver会话时的 QPS,默认是5，QPS = 并发量/平均响应时间
      kube-api-qps: '15'
      # kubelet默认一次拉取一个镜像，设置为false可以同时拉取多个镜像，
      # 前提是存储驱动要为overlay2，对应的Dokcer也需要增加下载并发数，参考[docker配置](/rancher2x/install-prepare/best-practices/docker/)
      serialize-image-pulls: 'false'
      # 拉取镜像的最大并发数，registry-burst不能超过registry-qps ，
      # 仅当registry-qps大于0(零)时生效，(默认10)。如果registry-qps为0则不限制(默认5)。
      registry-burst: '10'
      registry-qps: '0'
      cgroups-per-qos: 'true'
      cgroup-driver: 'cgroupfs'

      # 节点资源预留
      enforce-node-allocatable: 'pods'
      system-reserved: 'cpu=0.25,memory=200Mi'
      kube-reserved: 'cpu=0.25,memory=1500Mi'
      # POD驱逐，这个参数只支持内存和磁盘。
      ## 硬驱逐阈值
      ### 当节点上的可用资源降至保留值以下时，就会触发强制驱逐。强制驱逐会强制kill掉POD，不会等POD自动退出。
      eviction-hard: 'memory.available<300Mi,nodefs.available<10%,imagefs.available<15%,nodefs.inodesFree<5%'
      ## 软驱逐阈值
      ### 以下四个参数配套使用，当节点上的可用资源少于这个值时但大于硬驱逐阈值时候，会等待eviction-soft-grace-period设置的时长；
      ### 等待中每10s检查一次，当最后一次检查还触发了软驱逐阈值就会开始驱逐，驱逐不会直接Kill POD，先发送停止信号给POD，然后等待eviction-max-pod-grace-period设置的时长；
      ### 在eviction-max-pod-grace-period时长之后，如果POD还未退出则发送强制kill POD"
      eviction-soft: 'memory.available<500Mi,nodefs.available<50%,imagefs.available<50%,nodefs.inodesFree<10%'
      eviction-soft-grace-period: 'memory.available=1m30s'
      eviction-max-pod-grace-period: '30'
      eviction-pressure-transition-period: '30s'
      # 指定kubelet多长时间向master发布一次节点状态。注意: 它必须与kube-controller中的nodeMonitorGracePeriod一起协调工作。(默认 10s)
      node-status-update-frequency: 10s
      # 设置cAdvisor全局的采集行为的时间间隔，主要通过内核事件来发现新容器的产生。默认1m0s
      global-housekeeping-interval: 1m0s
      # 每个已发现的容器的数据采集频率。默认10s
      housekeeping-interval: 10s
      # 所有运行时请求的超时，除了长时间运行的 pull, logs, exec and attach。超时后，kubelet将取消请求，抛出错误，然后重试。(默认2m0s)
      runtime-request-timeout: 2m0s
      # 指定kubelet计算和缓存所有pod和卷的卷磁盘使用量的间隔。默认为1m0s
      volume-stats-agg-period: 1m0s

    # 可以选择定义额外的卷绑定到服务
    extra_binds:
      - "/usr/libexec/kubernetes/kubelet-plugins:/usr/libexec/kubernetes/kubelet-plugins"
      - "/etc/iscsi:/etc/iscsi"
      - "/sbin/iscsiadm:/sbin/iscsiadm"
  kubeproxy:
    extra_args:
      # 默认使用iptables进行数据转发，如果要启用ipvs，则此处设置为`ipvs`
      proxy-mode: ""
      # 与kubernetes apiserver通信并发数,默认10
      kube-api-burst: 20
      # 与kubernetes apiserver通信时使用QPS，默认值5，QPS=并发量/平均响应时间
      kube-api-qps: 10
    extra_binds:
  scheduler:
    extra_args: {}
    extra_binds: []
    extra_env: []

# 目前，只支持x509验证
## 您可以选择创建额外的SAN(主机名或IP)以添加到API服务器PKI证书。
## 如果要为control plane servers使用负载均衡器，这很有用。
authentication:
  strategy: "x509|webhook"
  webhook:
    config_file: "...."
    cache_timeout: 5s
  sans:
    # 此处配置备用域名或IP，当主域名或者IP无法访问时，可通过备用域名或IP访问
    - "192.168.1.100"
    - "www.test.com"
# Kubernetes认证模式
## Use `mode: rbac` 启用 RBAC
## Use `mode: none` 禁用 认证
authorization:
  mode: rbac
# 如果要设置Kubernetes云提供商，需要指定名称和配置，非云主机则留空；
cloud_provider:
# Add-ons是通过kubernetes jobs来部署。 在超时后，RKE将放弃重试获取job状态。以秒为单位。
addon_job_timeout: 30
# 有几个网络插件可以选择：`flannel、canal、calico`，Rancher2默认canal
network:
  # rke v1.0.4+ 可用，如果选择canal网络驱动，需要设置mtu为1450
  mtu: 1450  
  plugin: canal
  options:
    flannel_backend_type: "vxlan"
# 目前只支持nginx ingress controller
## 可以设置`provider: none`来禁用ingress controller
ingress:
  provider: nginx
  node_selector:
    app: ingress
# 配置dns上游dns服务器
## 可用rke版本 v0.2.0
dns:
  provider: coredns
  upstreamnameservers:
  - 114.114.114.114
  - 1.2.4.8
  node_selector:
    app: dns
# 安装附加应用
## 所有附加应用都必须指定命名空间
addons: |-
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      namespace: default
    spec:
      containers:
        image: nginx
        ports:
        - containerPort: 80

addons_include:
    - https://raw.githubusercontent.com/rook/rook/master/cluster/examples/kubernetes/rook-operator.yml
    - https://raw.githubusercontent.com/rook/rook/master/cluster/examples/kubernetes/rook-cluster.yml
    - /path/to/manifest

个人示例：

nodes:
- address: 192.168.0.1
  port: "22"
  internal_address: 192.168.0.1
  role:
  - controlplane
  - etcd
  - worker
  hostname_override: 192.168.0.1
  user: admin
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ""
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: 192.168.0.2
  port: "22"
  internal_address: 192.168.0.2
  role:
  - controlplane
  - etcd
  - worker
  hostname_override: 192.168.0.2
  user: admin
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ""
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: 192.168.0.3
  port: "22"
  internal_address: 192.168.0.3
  role:
  - controlplane
  - etcd
  - worker
  hostname_override: 192.168.0.3
  user: admin
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ""
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
services:
  etcd:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    external_urls: []
    ca_cert: ""
    cert: ""
    key: ""
    path: ""
    uid: 0
    gid: 0
    snapshot: null
    retention: ""
    creation: ""
    backup_config: null
  kube-api:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    service_cluster_ip_range: 172.26.96.0/20
    service_node_port_range: "30000-40000"
    pod_security_policy: false
    always_pull_images: false
    secrets_encryption_config: null
    audit_log: null
    admission_configuration: null
    event_rate_limit: null
  kube-controller:
    image: ""
    extra_args: {}
    extra_args:
      # 修改每个节点子网大小(cidr掩码长度)，默认为24，可用IP为254个；23，可用IP为510个；22，可用IP为1022个；
      node-cidr-mask-size: '25'
    extra_binds: []
    extra_env: []
    cluster_cidr: 172.26.112.0/20
    service_cluster_ip_range: 172.26.96.0/20
  scheduler:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
  kubelet:
    image: ""
    extra_args:
      # 修改节点最大Pod数量
      max-pods: "120"
    extra_binds: []
    extra_env: []
    cluster_domain: cluster.local
    infra_container_image: ""
    cluster_dns_server: 172.26.96.10
    fail_swap_on: false
    generate_serving_certificate: false
  kubeproxy:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
network:
  plugin: flannel
  options: {}
  mtu: 0
  node_selector: {}
authentication:
  strategy: x509
  sans: []
  webhook: null
# All add-on manifests MUST specify a namespace
addons: |-
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      labels:
        app: flannel
        tier: node
      name: kube-flannel-cfg
      namespace: kube-system
    data:
      cni-conf.json: |
        {
          "name": "cbr0",
          "cniVersion":"0.3.1",
          "plugins": [
            {
              "type": "flannel",
              "delegate": {
                "hairpinMode": true,
                "isDefaultGateway": true
              }
            },
            {
              "type": "portmap",
              "capabilities": {
                "portMappings": true
              }
            }
          ]
        }
      net-conf.json: |
        {
          "Network": "172.26.112.0/20",
          "Backend": {
            "Type": "vxlan",
            "VNI": 1,
            "Port": 8472
          }
        }
addons_include: []
system_images:
  etcd: rancher/coreos-etcd:v3.4.3-rancher1
  alpine: rancher/rke-tools:v0.1.52
  nginx_proxy: rancher/rke-tools:v0.1.52
  cert_downloader: rancher/rke-tools:v0.1.52
  kubernetes_services_sidecar: rancher/rke-tools:v0.1.52
  kubedns: rancher/k8s-dns-kube-dns:1.15.0
  dnsmasq: rancher/k8s-dns-dnsmasq-nanny:1.15.0
  kubedns_sidecar: rancher/k8s-dns-sidecar:1.15.0
  kubedns_autoscaler: rancher/cluster-proportional-autoscaler:1.7.1
  coredns: rancher/coredns-coredns:1.6.5
  coredns_autoscaler: rancher/cluster-proportional-autoscaler:1.7.1
  kubernetes: rancher/hyperkube:v1.17.2-rancher1
  flannel: rancher/coreos-flannel:v0.11.0-rancher1
  flannel_cni: rancher/flannel-cni:v0.3.0-rancher5
  calico_node: rancher/calico-node:v3.10.2
  calico_cni: rancher/calico-cni:v3.10.2
  calico_controllers: rancher/calico-kube-controllers:v3.10.2
  calico_ctl: rancher/calico-ctl:v2.0.0
  calico_flexvol: rancher/calico-pod2daemon-flexvol:v3.10.2
  canal_node: rancher/calico-node:v3.10.2
  canal_cni: rancher/calico-cni:v3.10.2
  canal_flannel: rancher/coreos-flannel:v0.11.0
  canal_flexvol: rancher/calico-pod2daemon-flexvol:v3.10.2
  weave_node: weaveworks/weave-kube:2.5.2
  weave_cni: weaveworks/weave-npc:2.5.2
  pod_infra_container: rancher/pause:3.1
  ingress: rancher/nginx-ingress-controller:nginx-0.25.1-rancher1
  ingress_backend: rancher/nginx-ingress-controller-defaultbackend:1.5-rancher1
  metrics_server: rancher/metrics-server:v0.3.6
  windows_pod_infra_container: rancher/kubelet-pause:v0.1.3
ssh_key_path: ~/.ssh/id_rsa
ssh_cert_path: ""
ssh_agent_auth: false
authorization:
  mode: rbac
  options: {}
ignore_docker_version: true
kubernetes_version: ""
private_registries: []
ingress:
  provider: ""
  options: {}
  node_selector: {}
  extra_args: {}
  dns_policy: ""
  extra_envs: []
  extra_volumes: []
  extra_volume_mounts: []
cluster_name: ""
cloud_provider:
  name: ""
prefix_path: ""
addon_job_timeout: 0
bastion_host:
  address: ""
  port: ""
  user: ""
  ssh_key: ""
  ssh_key_path: ""
  ssh_cert: ""
  ssh_cert_path: ""
monitoring:
  provider: ""
  options: {}
  node_selector: {}
restore:
  restore: false
  snapshot_name: ""
dns: null

OK，到目前为止集群已经安装完成。

重要说明

以下文件需要维护，故障排除和升级群集。

将以下文件的副本保存在安全的位置：

cluster.yml：RKE集群配置文件。
kube_config_cluster.yml：集群的Kubeconfig文件，此文件包含用于完全访问集群的凭据。
cluster.rkestate：Kubernetes群集状态文件，此文件包含用于完全访问群集的凭据。

仅在使用RKE v0.2.0或更高版本时创建Kubernetes群集状态文件。

RKE官方文档

码农公寓