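1. Preface
By default, Kubernetes writes each container's stdout and stderr to the /var/log/containers directory on the node (minion), while the logs of the Kubernetes components themselves go to /var/log.
If you started your Kubernetes cluster with kube-up, congratulations: you can enable the k8s logging feature easily. See http://kubernetes.io/docs/getting-started-guides/logging-elasticsearch/ or look for the installation files in kubernetes/cluster/addons/fluentd-elasticsearch under the directory where the k8s release package was unpacked.
If you started the cluster with manual commands or your own scripts, congratulations as well: some tinkering awaits you.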
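2. Requirements
Collect the stdout and stderr logs of the containers in every pod and display them in one place.
3. Deployment architecture
To be uploaded.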
4. Elasticsearch & Kibana
Both use the official images. If Elasticsearch needs to run as a cluster, see:
https://hub.docker.com/r/fabric8/elasticsearch/tags/
https://github.com/fabric8io/elasticsearch-cloud-kubernetes
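My k8s environment has no access to the public internet, so the images are first pulled from the official registry and then tagged for the private registry.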
docker pull elasticsearch:2.3
docker pull kibana:4.5
docker tag elasticsearch:2.3 10.10.50.161:5000/elasticsearch:2.3
docker tag kibana:4.5 10.10.50.161:5000/kibana:4.5
docker push 10.10.50.161:5000/elasticsearch:2.3
docker push 10.10.50.161:5000/kibana:4.5
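The pull, tag and push steps above repeat for every image, so they can be generated by a small helper. This is only a sketch: the function name mirror_cmds and the REGISTRY variable are made up for illustration, and the script merely prints the docker commands so they can be reviewed before anything is run.

```shell
#!/bin/sh
# Hypothetical helper: print the pull/tag/push commands for each image.
# REGISTRY defaults to the private registry address used in this article.
REGISTRY="${REGISTRY:-10.10.50.161:5000}"

mirror_cmds() {
    for image in "$@"; do
        echo "docker pull ${image}"
        echo "docker tag ${image} ${REGISTRY}/${image}"
        echo "docker push ${REGISTRY}/${image}"
    done
}

# Dry run: only prints the commands, nothing is executed.
mirror_cmds elasticsearch:2.3 kibana:4.5
```

Once the printed commands look right, piping the output to sh executes them.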
Create the rc and svc:
kubectl create -f elasticsearch-kibana-svc.yaml
kubectl create -f elasticsearch-kibana-rc.yaml
elasticsearch-kibana-rc.yaml:
apiVersion: v1
kind: ReplicationController
metadata:
  name: elasticsearch-kibana
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-kibana
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 1
  selector:
    k8s-app: elasticsearch-kibana
    version: v1
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-kibana
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      containers:
      - image: 10.10.50.161:5000/elasticsearch:2.3
        name: elasticsearch
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 100m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: http
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        volumeMounts:
        - name: es-persistent-storage
          mountPath: /usr/share/elasticsearch/data
      - image: 10.10.50.161:5000/kibana:4.5
        name: kibana
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 100m
          requests:
            cpu: 100m
        ports:
        - containerPort: 5601
          name: ui
          protocol: TCP
        env:
        - name: ELASTICSEARCH_URL
          value: http://localhost:9200
      volumes:
      - name: es-persistent-storage
        emptyDir: {}
elasticsearch-kibana-svc.yaml:
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-kibana
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-kibana
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "elasticsearch-kibana"
spec:
  type: NodePort
  ports:
  - name: elasticsearch-http
    port: 9200
    protocol: TCP
    targetPort: http
  - name: elasticsearch-transport
    port: 9300
    protocol: TCP
    targetPort: transport
  - name: kibana
    port: 5601
    protocol: TCP
    targetPort: ui
    nodePort: 30016
  selector:
    k8s-app: elasticsearch-kibana
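Note: here the Elasticsearch data is stored in the pod's emptyDir volume, so it is lost whenever the pod is removed. A real production environment should use nodeSelector to pin the pod to one node and keep the data in /usr/share/elasticsearch/data on that node.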
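The pinning described in the note could look roughly like the sketch below. The node name and the label es-data=true are assumptions chosen for illustration, not values from my cluster setup; the script prints the labeling command and the pod-spec fields instead of applying them.

```shell
#!/bin/sh
# Sketch only: NODE_NAME and the label es-data=true are hypothetical.
NODE_NAME="${NODE_NAME:-10.10.50.155}"

# Step 1 (printed, not executed): label the node that will hold the data.
echo "kubectl label nodes ${NODE_NAME} es-data=true"

# Step 2: fields to merge into elasticsearch-kibana-rc.yaml under
# spec.template.spec, replacing the emptyDir volume definition.
pod_spec_fragment() {
cat <<'EOF'
      nodeSelector:
        es-data: "true"
      volumes:
      - name: es-persistent-storage
        hostPath:
          path: /usr/share/elasticsearch/data
EOF
}

pod_spec_fragment
```

The volumeMounts entry of the elasticsearch container stays as it is, because the volume keeps the name es-persistent-storage.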
5. Fluentd
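The key piece is fluentd. The version in addons depends on the certificates produced by the make-ca step that kube-up references. In my environment, a colleague who had earlier deployed heapster+influxdb+grafana used a single-domain certificate; when I later tried to switch to a multi-domain (SAN) certificate, everything turned into a mess, so I decided to bypass the k8s internal environment altogether.
References for multi-domain certificates: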
https://coreos.com/kubernetes/docs/latest/openssl.html
http://www.linuxidc.com/Linux/2014-10/108222.htm
http://apetec.com/support/GenerateSAN-CSR.htm
https://certificates.heanet.ie/node/17
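Here I use fabric8's fabric8/fluentd-kubernetes image, but it has to be rebuilt.
fabric8/fluentd-kubernetes itself does not depend on HTTPS or certificates, but the fluentd plugin inside it, fluent-plugin-kubernetes_metadata_filter, does, which is rather painful.
First pull the official image: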
docker pull fabric8/fluentd-kubernetes:v1.14
docker tag fabric8/fluentd-kubernetes:v1.14 10.10.50.161:5000/fabric8/fluentd-kubernetes:v1.14
Build the image:
mkdir myfluent
cd myfluent
touch Dockerfile
touch start-fluentd
chmod +x start-fluentd
Dockerfile:
FROM 10.10.50.161:5000/fabric8/fluentd-kubernetes:v1.14
MAINTAINER miaobainian <miaobainian36@163.com>
ADD start-fluentd /start-fluentd
start-fluentd:
#!/bin/sh
ELASTICSEARCH_HOST=${ELASTICSEARCH_HOST:-es-logging.default.svc}
ELASTICSEARCH_PORT=${ELASTICSEARCH_PORT:-9200}
ELASTICSEARCH_SCHEME=${ELASTICSEARCH_SCHEME:-http}
FLUENTD_FLUSH_INTERVAL=${FLUENTD_FLUSH_INTERVAL:-10s}
FLUENTD_FLUSH_THREADS=${FLUENTD_FLUSH_THREADS:-1}
FLUENTD_RETRY_LIMIT=${FLUENTD_RETRY_LIMIT:-10}
FLUENTD_DISABLE_RETRY_LIMIT=${FLUENTD_DISABLE_RETRY_LIMIT:-true}
FLUENTD_RETRY_WAIT=${FLUENTD_RETRY_WAIT:-1s}
FLUENTD_MAX_RETRY_WAIT=${FLUENTD_MAX_RETRY_WAIT:-60s}
FLUENTD_BUFFER_CHUNK_LIMIT=${FLUENTD_BUFFER_CHUNK_LIMIT:-8m}
FLUENTD_BUFFER_QUEUE_LIMIT=${FLUENTD_BUFFER_QUEUE_LIMIT:-8192}
FLUENTD_BUFFER_TYPE=${FLUENTD_BUFFER_TYPE:-memory}
FLUENTD_BUFFER_PATH=${FLUENTD_BUFFER_PATH:-/var/fluentd/buffer}
FLUENTD_LOGSTASH_FORMAT=${FLUENTD_LOGSTASH_FORMAT:-true}
KUBERNETES_PRESERVE_JSON_LOG=${KUBERNETES_PRESERVE_JSON_LOG:-true}
mkdir /etc/fluent
cat << EOF >> /etc/fluent/fluent.conf
<source>
type tail
path /var/log/containers/*.log
pos_file /var/log/es-containers.log.pos
time_format %Y-%m-%dT%H:%M:%S.%N
tag kubernetes.*
format json
read_from_head true
keep_time_key true
</source>
<filter kubernetes.**>
type kubernetes_metadata
preserve_json_log ${KUBERNETES_PRESERVE_JSON_LOG}
kubernetes_url ${KUBERNETES_URL}
verify_ssl ${VERIFY_SSL}
</filter>
<match **>
type elasticsearch$([ "${ELASTICSEARCH_DYNAMIC}" == "true" ] && echo _dynamic)
log_level info
include_tag_key true
time_key time
host ${ELASTICSEARCH_HOST}
port ${ELASTICSEARCH_PORT}
scheme ${ELASTICSEARCH_SCHEME}
$([ -n "${ELASTICSEARCH_USER}" ] && echo user ${ELASTICSEARCH_USER})
$([ -n "${ELASTICSEARCH_PASSWORD}" ] && echo password ${ELASTICSEARCH_PASSWORD})
buffer_type ${FLUENTD_BUFFER_TYPE}
$([ "${FLUENTD_BUFFER_TYPE}" == "file" ] && echo buffer_path ${FLUENTD_BUFFER_PATH})
buffer_chunk_limit ${FLUENTD_BUFFER_CHUNK_LIMIT}
buffer_queue_limit ${FLUENTD_BUFFER_QUEUE_LIMIT}
flush_interval ${FLUENTD_FLUSH_INTERVAL}
retry_limit ${FLUENTD_RETRY_LIMIT}
$([ "${FLUENTD_DISABLE_RETRY_LIMIT}" == "true" ] && echo disable_retry_limit)
retry_wait ${FLUENTD_RETRY_WAIT}
max_retry_wait ${FLUENTD_MAX_RETRY_WAIT}
num_threads ${FLUENTD_FLUSH_THREADS}
logstash_format ${FLUENTD_LOGSTASH_FORMAT}
$([ -n "${FLUENTD_LOGSTASH_PREFIX}" ] && echo logstash_prefix ${FLUENTD_LOGSTASH_PREFIX})
reload_connections false
EOF
cat << 'EOF' >> /etc/fluent/fluent.conf
</match>
EOF
exec je fluentd
Note that, compared with the official start-fluentd, two key lines are added here:
kubernetes_url ${KUBERNETES_URL}
verify_ssl ${VERIFY_SSL}
Their purpose is to bypass the verification against https://clusterIp:443.
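How these lines end up in fluent.conf can be tried outside the container. The sketch below is a cut-down stand-in for start-fluentd, not the real script: render_conf is a made-up name, and the values are the ones this article later passes in through the DaemonSet. Because the heredoc delimiter is unquoted, the shell expands ${...} and $(...) while writing the file, and a conditional $(...) line collapses to an empty line when its test fails.

```shell
#!/bin/sh
# Values that the pod's env section would normally provide.
KUBERNETES_URL="http://10.10.50.156:8080/api"
VERIFY_SSL="false"
ELASTICSEARCH_USER=""   # left empty on purpose

# Hypothetical, reduced version of the heredoc in start-fluentd.
render_conf() {
cat <<EOF
<filter kubernetes.**>
type kubernetes_metadata
kubernetes_url ${KUBERNETES_URL}
verify_ssl ${VERIFY_SSL}
</filter>
<match **>
type elasticsearch
$([ -n "${ELASTICSEARCH_USER}" ] && echo "user ${ELASTICSEARCH_USER}")
</match>
EOF
}

render_conf
```

With ELASTICSEARCH_USER empty, no user line is emitted; setting it before calling render_conf makes the line appear.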
docker build -t 10.10.50.161:5000/fabric8/fluentd-kubernetes:v1.15 .
docker push 10.10.50.161:5000/fabric8/fluentd-kubernetes:v1.15
OK, with the fluentd image built, the pods can be created. A static pod would work, but a DaemonSet is recommended.
touch fluentd-daemon.yaml
fluentd-daemon.yaml:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
spec:
  template:
    metadata:
      labels:
        k8s-app: fluentd-logging
    spec:
      containers:
      - name: fluentd-elasticsearch
        # the tag built and pushed above
        image: 10.10.50.161:5000/fabric8/fluentd-kubernetes:v1.15
        resources:
          limits:
            cpu: 100m
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        env:
        - name: KUBERNETES_URL
          value: "http://10.10.50.156:8080/api"
        - name: VERIFY_SSL
          value: "false"
        - name: ELASTICSEARCH_HOST
          value: elasticsearch-kibana
        - name: ELASTICSEARCH_PORT
          value: "9200"
        - name: FLUENTD_FLUSH_INTERVAL
          value: "300s"
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
kubectl create -f fluentd-daemon.yaml
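If the pods fail to pull their image, a common cause is that the tag in the manifest differs from the tag that was pushed to the private registry. A small check like the following lists every image a manifest refers to; manifest_images is a made-up helper name, and the awk pattern assumes the simple "image: value" form used in this article.

```shell
#!/bin/sh
# Hypothetical helper: print every image referenced by a manifest,
# one per line. Handles both "image: x" and "- image: x" lines.
manifest_images() {
    awk '$1 == "image:" || ($1 == "-" && $2 == "image:") { print $NF }' "$1"
}
```

For example, manifest_images fluentd-daemon.yaml prints the fluentd image reference, which can then be compared with the tag used in the docker push step.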
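Notes on the environment variables:
KUBERNETES_URL points to the API server on the k8s master node.
VERIFY_SSL set to false means the CA is not verified.
ELASTICSEARCH_HOST is the name of the elasticsearch-kibana service deployed earlier. This relies on DNS (the service name is the DNS name). If DNS is not installed, you can run kubectl get svc --namespace=kube-system to find the cluster IP of elasticsearch-kibana and configure that instead. Keep in mind, though, that the cluster IP can change.
ELASTICSEARCH_PORT is the in-cluster service port of Elasticsearch within the elasticsearch-kibana service.
FLUENTD_FLUSH_INTERVAL sets the interval at which collected logs are flushed. It is set to 300s because the first collection takes a long time; with too short an interval, the connection to Elasticsearch keeps being re-established.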
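To confirm that ELASTICSEARCH_HOST and ELASTICSEARCH_PORT really lead to Elasticsearch, the same two values can be used to query it directly. This is a sketch under assumptions: es_url and list_indices are made-up helper names, curl must be available, and the short service name only resolves from a pod in the kube-system namespace with cluster DNS working. The _cat/indices endpoint is part of the Elasticsearch REST API.

```shell
#!/bin/sh
# Same defaults as the DaemonSet env section in this article.
ELASTICSEARCH_HOST="${ELASTICSEARCH_HOST:-elasticsearch-kibana}"
ELASTICSEARCH_PORT="${ELASTICSEARCH_PORT:-9200}"

# Hypothetical helper: build a URL for an Elasticsearch API path.
es_url() {
    printf 'http://%s:%s/%s\n' "$ELASTICSEARCH_HOST" "$ELASTICSEARCH_PORT" "$1"
}

# Hypothetical helper: list the indices known to Elasticsearch.
# Not called here, because it needs network access to the service.
list_indices() {
    curl -s "$(es_url '_cat/indices?v')"
}

echo "would query: $(es_url '_cat/indices?v')"
```

With logstash_format enabled, as it is by default in start-fluentd, the indices written by fluentd are named logstash- followed by the date, so list_indices should show such entries after the first flush.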
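OK, everything is ready now.
Open in a browser:
http://10.10.50.155:30016/
10.10.50.155 is one of the nodes in my k8s cluster, and 30016 is the node port of the elasticsearch-kibana service.
This brings up the Kibana UI.
By default it opens the indices page under Settings.
Uncheck "Index contains time-based events" and click Create below.
OK, have a cup of coffee or tea and wait for the k8s logs to show up.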