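Health check configuration
Kubernetes provides liveness and readiness probes to monitor Pod health. When we deploy the Kubernetes Ingress Controller, we likewise configure a livenessProbe and a readinessProbe for it. The configuration is as follows: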
livenessProbe:
  failureThreshold: 3
  httpGet:
    path: /healthz
    port: 10254
    scheme: HTTP
  initialDelaySeconds: 10
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 1
readinessProbe:
  failureThreshold: 3
  httpGet:
    path: /healthz
    port: 10254
    scheme: HTTP
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 1
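When the kubelet periodically probes the Nginx Ingress Controller Pod, it sends an HTTP GET request to the Pod's IP address, similar to the following:
curl -XGET http://<NGINX_INGRESS_CONTROLLER_POD_IP>:10254/healthz
If the health check succeeds, the response is ok; if it fails, the response contains the failure information.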
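To make the probe semantics concrete, here is a minimal sketch of a single HTTP probe attempt. The helper names (isProbeSuccess, probeOnce) and the in-process test server are assumptions made for illustration; this is not the kubelet's code. It relies only on the documented rule that a status code from 200 up to, but not including, 400 counts as success.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// isProbeSuccess applies the documented rule for HTTP probes:
// any status code in [200, 400) counts as success.
func isProbeSuccess(statusCode int) bool {
	return statusCode >= 200 && statusCode < 400
}

// probeOnce sends one GET with a timeout, like a single probe attempt.
func probeOnce(url string, timeout time.Duration) (bool, error) {
	client := &http.Client{Timeout: timeout}
	res, err := client.Get(url)
	if err != nil {
		return false, err // connection errors and timeouts count as failures
	}
	defer res.Body.Close()
	return isProbeSuccess(res.StatusCode), nil
}

func main() {
	// Hypothetical stand-in for the controller's health endpoint.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	// timeoutSeconds: 1 in the probe configuration maps to this timeout.
	healthy, err := probeOnce(srv.URL+"/healthz", 1*time.Second)
	fmt.Println(healthy, err)
}
```

With failureThreshold: 3, a single failed attempt like this does not restart the container; three consecutive failures are required.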
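How it works
So what does the Nginx Ingress Controller actually do internally when the kubelet probes the Ingress Controller Pod, and why port 10254 and the path /healthz? This article takes a brief look at the health check logic inside the Kubernetes Ingress Controller.
1. 10254 and /healthz
First, when the Nginx Ingress Controller starts, it launches an HTTP server in a goroutine:
// Initialize an HTTP request handler (ServeMux)
mux := http.NewServeMux()
go registerHandlers(conf.EnableProfiling, conf.ListenPorts.Health, ngx, mux)
The registerHandlers function is implemented as follows: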
func registerHandlers(enableProfiling bool, port int, ic *controller.NGINXController, mux *http.ServeMux) {
	// Register the health check handlers
	healthz.InstallHandler(mux,
		healthz.PingHealthz,
		ic,
	)
	// Expose metrics for Prometheus to scrape
	mux.Handle("/metrics", promhttp.Handler())
	// Report the current Ingress Controller version
	mux.HandleFunc("/build", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
		b, _ := json.Marshal(version.String())
		w.Write(b)
	})
	// Stop the Ingress Controller on request by sending SIGTERM to its own process
	mux.HandleFunc("/stop", func(w http.ResponseWriter, r *http.Request) {
		err := syscall.Kill(syscall.Getpid(), syscall.SIGTERM)
		if err != nil {
			glog.Errorf("Unexpected error: %v", err)
		}
	})
	// Expose profiling (pprof) endpoints
	if enableProfiling {
		mux.HandleFunc("/debug/pprof/", pprof.Index)
		mux.HandleFunc("/debug/pprof/heap", pprof.Index)
		mux.HandleFunc("/debug/pprof/mutex", pprof.Index)
		mux.HandleFunc("/debug/pprof/goroutine", pprof.Index)
		mux.HandleFunc("/debug/pprof/threadcreate", pprof.Index)
		mux.HandleFunc("/debug/pprof/block", pprof.Index)
		mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
		mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
		mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
		mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	}
	// Start the HTTP server
	server := &http.Server{
		Addr:              fmt.Sprintf(":%v", port), // port to listen on
		Handler:           mux,
		ReadTimeout:       10 * time.Second,
		ReadHeaderTimeout: 10 * time.Second,
		WriteTimeout:      300 * time.Second,
		IdleTimeout:       120 * time.Second,
	}
	glog.Fatal(server.ListenAndServe())
}
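Because registerHandlers blocks inside ListenAndServe, the caller has to launch it with go. The following is a minimal sketch of the same pattern, not the controller's code: startHealthServer and fetch are hypothetical helpers, and the server binds to an ephemeral loopback port rather than 10254 so the example can run anywhere.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
)

// startHealthServer is a hypothetical helper: it serves /healthz in a
// background goroutine and returns the listen address and a stop function.
func startHealthServer() (string, func(), error) {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	})

	// Port 0 asks the OS for a free port; the controller would use 10254.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", nil, err
	}
	srv := &http.Server{Handler: mux}
	go srv.Serve(ln) // Serve blocks, so it runs in its own goroutine

	return ln.Addr().String(), func() { srv.Close() }, nil
}

// fetch returns the status code and body of a GET request.
func fetch(url string) (int, string, error) {
	res, err := http.Get(url)
	if err != nil {
		return 0, "", err
	}
	defer res.Body.Close()
	body, err := io.ReadAll(res.Body)
	return res.StatusCode, string(body), err
}

func main() {
	addr, stop, err := startHealthServer()
	if err != nil {
		panic(err)
	}
	defer stop()

	// The main goroutine is free to continue while the server runs.
	code, body, err := fetch("http://" + addr + "/healthz")
	fmt.Println(code, body, err)
}
```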
From the implementation we can see that the HTTP server listens on the port conf.ListenPorts.Health. That port value is parsed from the following startup flags when the Nginx Ingress Controller starts:
httpPort = flags.Int("http-port", 80, `Port to use for servicing HTTP traffic.`)
httpsPort = flags.Int("https-port", 443, `Port to use for servicing HTTPS traffic.`)
statusPort = flags.Int("status-port", 18080, `Port to use for exposing NGINX status pages.`)
sslProxyPort = flags.Int("ssl-passthrough-proxy-port", 442, `Port to use internally for SSL Passthrough.`)
defServerPort = flags.Int("default-server-port", 8181, `Port to use for exposing the default server (catch-all).`)
healthzPort = flags.Int("healthz-port", 10254, "Port to use for the healthz endpoint.")
Therefore, if we do not explicitly set the healthz-port flag when starting the Nginx Ingress Controller, it defaults to port 10254.
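The default-value behavior can be sketched in isolation. This example assumes Go's standard flag package rather than whichever flag library the controller actually uses, and parseHealthzPort is a hypothetical helper; only the flag name, default, and help text come from the listing above.

```go
package main

import (
	"flag"
	"fmt"
)

// parseHealthzPort parses only the healthz-port flag from args.
// Hypothetical helper; the flag name and default match the listing above.
func parseHealthzPort(args []string) (int, error) {
	flags := flag.NewFlagSet("nginx-ingress-controller", flag.ContinueOnError)
	healthzPort := flags.Int("healthz-port", 10254, "Port to use for the healthz endpoint.")
	if err := flags.Parse(args); err != nil {
		return 0, err
	}
	return *healthzPort, nil
}

func main() {
	def, _ := parseHealthzPort(nil)                                // flag omitted
	custom, _ := parseHealthzPort([]string{"--healthz-port=8080"}) // flag set
	fmt.Println(def, custom)                                       // prints: 10254 8080
}
```

If the flag is overridden, the port in the livenessProbe and readinessProbe definitions has to be changed to match.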
The function above also registers a health check request handler, which is done through healthz.InstallHandler:
func InstallHandler(mux mux, checks ...HealthzChecker) {
	// If no health checks are given, register only the default PingHealthz check
	if len(checks) == 0 {
		glog.V(5).Info("No default health checks specified. Installing the ping handler.")
		checks = []HealthzChecker{PingHealthz}
	}
	glog.V(5).Info("Installing healthz checkers:", strings.Join(checkerNames(checks...), ", "))
	// Register the root health check handler, which calls each individual check in turn
	mux.Handle("/healthz", handleRootHealthz(checks...))
	for _, check := range checks {
		// Register a handler for each individual health check
		mux.Handle(fmt.Sprintf("/healthz/%v", check.Name()), adaptCheckToHandler(check.Check))
	}
}
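Here we can see that the root path registered for health checks is /healthz, and that each check is additionally exposed under /healthz/<name>. We can also add further health checks by implementing the HealthzChecker interface.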
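As an illustration of extending the checks, the sketch below defines a custom checker. The HealthzChecker interface is reproduced locally with the two methods used in the listings (Name and Check) so the example compiles on its own; flagChecker and the name cache-synced are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// HealthzChecker is reproduced here so the sketch is self-contained;
// it has the two methods that InstallHandler relies on.
type HealthzChecker interface {
	Name() string
	Check(req *http.Request) error
}

// flagChecker is a hypothetical checker that fails until *ready is true.
type flagChecker struct {
	name  string
	ready *bool
}

func (c flagChecker) Name() string { return c.name }

func (c flagChecker) Check(_ *http.Request) error {
	if !*c.ready {
		return errors.New("not ready yet")
	}
	return nil
}

func main() {
	ready := false
	var check HealthzChecker = flagChecker{name: "cache-synced", ready: &ready}

	fmt.Println(check.Name(), check.Check(nil)) // fails while ready is false
	ready = true
	fmt.Println(check.Name(), check.Check(nil)) // passes once ready is true
}
```

Given the loop in InstallHandler, passing such a checker alongside the existing ones would presumably expose it at /healthz/cache-synced as well as including it in the root /healthz result.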
2. Health check mechanism
As the previous section showed, when the kubelet probes the Nginx Ingress Controller Pod, the request ultimately triggers the handler returned by handleRootHealthz:
func handleRootHealthz(checks ...HealthzChecker) http.HandlerFunc {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		failed := false
		var verboseOut bytes.Buffer
		for _, check := range checks {
			if err := check.Check(r); err != nil {
				// don't include the error since this endpoint is public. If someone wants more detail
				// they should have explicit permission to the detailed checks.
				glog.V(6).Infof("healthz check %v failed: %v", check.Name(), err)
				fmt.Fprintf(&verboseOut, "[-]%v failed: reason withheld\n", check.Name())
				failed = true
			} else {
				fmt.Fprintf(&verboseOut, "[+]%v ok\n", check.Name())
			}
		}
		// always be verbose on failure
		if failed {
			http.Error(w, fmt.Sprintf("%vhealthz check failed", verboseOut.String()), http.StatusInternalServerError)
			return
		}
		if _, found := r.URL.Query()["verbose"]; !found {
			fmt.Fprint(w, "ok")
			return
		}
		verboseOut.WriteTo(w)
		fmt.Fprint(w, "healthz check passed\n")
	})
}
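This handler calls each of the registered health checks in turn. If all of them pass it returns ok; if any check fails it responds with HTTP 500 and a per-check report in which the failure reason is withheld.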
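The three possible responses are easier to see with a runnable example. The sketch below re-implements the same aggregation logic against a simplified check type; check, okCheck, failCheck, rootHandler and query are all hypothetical names, and logging is omitted.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// check is a simplified stand-in for HealthzChecker (hypothetical type).
type check struct {
	name string
	fn   func(*http.Request) error
}

func okCheck(name string) check {
	return check{name, func(*http.Request) error { return nil }}
}

func failCheck(name string) check {
	return check{name, func(*http.Request) error { return errors.New("boom") }}
}

// rootHandler follows the same aggregation logic as handleRootHealthz.
func rootHandler(checks ...check) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		failed := false
		var out bytes.Buffer
		for _, c := range checks {
			if err := c.fn(r); err != nil {
				fmt.Fprintf(&out, "[-]%v failed: reason withheld\n", c.name)
				failed = true
			} else {
				fmt.Fprintf(&out, "[+]%v ok\n", c.name)
			}
		}
		if failed {
			http.Error(w, out.String()+"healthz check failed", http.StatusInternalServerError)
			return
		}
		if _, verbose := r.URL.Query()["verbose"]; !verbose {
			fmt.Fprint(w, "ok")
			return
		}
		out.WriteTo(w)
		fmt.Fprint(w, "healthz check passed\n")
	}
}

// query runs one in-memory request against h and returns status and body.
func query(h http.Handler, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	fmt.Println(query(rootHandler(okCheck("ping")), "/healthz"))
	fmt.Println(query(rootHandler(okCheck("ping")), "/healthz?verbose"))
	fmt.Println(query(rootHandler(okCheck("ping"), failCheck("broken")), "/healthz"))
}
```

Note that the failure response lists every check, including the ones that passed, so appending ?verbose only changes the output when everything is healthy.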
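From the earlier code we can also see that the Nginx Ingress Controller registers two health checkers at startup:
healthz.PingHealthz
This is the default implementation of the HealthzChecker interface, and its logic is trivial: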
// PingHealthz returns true automatically when checked
var PingHealthz HealthzChecker = ping{}

// ping implements the simplest possible healthz checker.
type ping struct{}

func (ping) Name() string {
	return "ping"
}

// PingHealthz is a health check that returns true.
func (ping) Check(_ *http.Request) error {
	return nil
}
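controller.NGINXController
This is the concrete implementation of the Nginx Ingress Controller itself, but it also implements the HealthzChecker interface in order to run the necessary health checks on the resources it manages: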
const (
	ngxHealthPath = "/healthz"
	nginxPID      = "/tmp/nginx.pid"
)

func (n NGINXController) Name() string {
	return "nginx-ingress-controller"
}

func (n *NGINXController) Check(_ *http.Request) error {
	// 1. Health check Nginx itself; with the default port the URL is http://0.0.0.0:18080/healthz
	res, err := http.Get(fmt.Sprintf("http://0.0.0.0:%v%v", n.cfg.ListenPorts.Status, ngxHealthPath))
	if err != nil {
		return err
	}
	defer res.Body.Close()
	if res.StatusCode != 200 {
		return fmt.Errorf("ingress controller is not healthy")
	}
	// 2. If dynamic-configuration is enabled, check the backend data Nginx keeps in memory;
	//    with the default port the URL is http://0.0.0.0:18080/is-dynamic-lb-initialized
	if n.cfg.DynamicConfigurationEnabled {
		res, err := http.Get(fmt.Sprintf("http://0.0.0.0:%v/is-dynamic-lb-initialized", n.cfg.ListenPorts.Status))
		if err != nil {
			return err
		}
		defer res.Body.Close()
		if res.StatusCode != 200 {
			return fmt.Errorf("dynamic load balancer not started")
		}
	}
	// 3. Check that the Nginx master process is running
	fs, err := proc.NewFS("/proc")
	if err != nil {
		return errors.Wrap(err, "unexpected error reading /proc directory")
	}
	f, err := n.fileSystem.ReadFile(nginxPID)
	if err != nil {
		return errors.Wrapf(err, "unexpected error reading %v", nginxPID)
	}
	pid, err := strconv.Atoi(strings.TrimRight(string(f), "\r\n"))
	if err != nil {
		return errors.Wrapf(err, "unexpected error reading the nginx PID from %v", nginxPID)
	}
	_, err = fs.NewProc(pid)
	return err
}
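Step 3 of Check can be approximated with the standard library alone. This is a simplified sketch, assuming Linux and a plain os.Stat on /proc/<pid> in place of the proc.NewFS and NewProc calls used by the controller; parsePID and processExists are hypothetical helpers.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parsePID extracts the PID from the contents of a pid file,
// trimming the trailing newline the same way the controller code does.
func parsePID(content []byte) (int, error) {
	return strconv.Atoi(strings.TrimRight(string(content), "\r\n"))
}

// processExists reports whether /proc/<pid> exists (Linux only).
// It is a simplification of the /proc lookup in the listing above.
func processExists(pid int) bool {
	_, err := os.Stat(fmt.Sprintf("/proc/%d", pid))
	return err == nil
}

func main() {
	// A pid file normally holds the number followed by a newline.
	pid, err := parsePID([]byte("1234\n"))
	fmt.Println(pid, err)

	// Checking the current process avoids depending on a real nginx.pid.
	fmt.Println(processExists(os.Getpid()))
}
```

A stale pid file is the case this step guards against: the file can still be readable after the Nginx master process has died, and only the /proc lookup reveals it.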
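We can see that the port used to reach Nginx is n.cfg.ListenPorts.Status, whose value also comes from a startup flag of the Nginx Ingress Controller, status-port, with a default of 18080.
Finally, the Nginx configuration file shows that Nginx also listens on port 18080 when it starts, which is what allows the controller to health check it through that port: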
# used for NGINX healthcheck and access to nginx stats
server {
    listen 18080 default_server backlog=511;
    listen [::]:18080 default_server backlog=511;
    set $proxy_upstream_name "-";
    # Returns 200 directly to show that Nginx is able to accept requests
    location /healthz {
        access_log off;
        return 200;
    }
    # Verifies that backend data is present in memory
    location /is-dynamic-lb-initialized {
        access_log off;
        content_by_lua_block {
            local configuration = require("configuration")
            local backend_data = configuration.get_backends_data()
            if not backend_data then
                ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
                return
            end
            ngx.say("OK")
            ngx.exit(ngx.HTTP_OK)
        }
    }
    # Exposes basic monitoring statistics
    location /nginx_status {
        set $proxy_upstream_name "internal";
        access_log off;
        stub_status on;
    }
    # Everything else is forwarded to the default (404) backend
    location / {
        set $proxy_upstream_name "upstream-default-backend";
        proxy_pass http://upstream-default-backend;
    }
}
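To see how the controller-side check and this status server fit together, the sketch below pairs a fake status server with a reduced version of steps 1 and 2 of Check. Everything here is illustrative: newFakeStatusServer, expect200 and checkStatus are hypothetical helpers, and the fake server only imitates the two locations shown above.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newFakeStatusServer is a hypothetical stand-in for the Nginx status server.
// initialized controls whether backend data is reported as present.
func newFakeStatusServer(initialized bool) *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	mux.HandleFunc("/is-dynamic-lb-initialized", func(w http.ResponseWriter, r *http.Request) {
		if !initialized {
			w.WriteHeader(http.StatusInternalServerError)
			return
		}
		fmt.Fprintln(w, "OK")
	})
	return httptest.NewServer(mux)
}

// expect200 issues a GET and returns an error unless the status is 200.
func expect200(url, failure string) error {
	res, err := http.Get(url)
	if err != nil {
		return err
	}
	defer res.Body.Close()
	if res.StatusCode != http.StatusOK {
		return errors.New(failure)
	}
	return nil
}

// checkStatus mirrors steps 1 and 2 of the controller's Check method.
func checkStatus(baseURL string, dynamic bool) error {
	if err := expect200(baseURL+"/healthz", "ingress controller is not healthy"); err != nil {
		return err
	}
	if dynamic {
		return expect200(baseURL+"/is-dynamic-lb-initialized", "dynamic load balancer not started")
	}
	return nil
}

func main() {
	srv := newFakeStatusServer(false) // backend data not yet loaded
	defer srv.Close()

	fmt.Println(checkStatus(srv.URL, false)) // dynamic check skipped
	fmt.Println(checkStatus(srv.URL, true))  // dynamic check fails
}
```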
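That concludes the analysis of the Nginx Ingress Controller health check mechanism. In summary, it verifies two things:
- Whether Nginx is running normally (it answers on the status port and its master process exists)
- If dynamic-configuration is enabled, whether the backend data kept in memory is present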