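A container's status may be Up, but that does not guarantee that the process inside it is healthy. We can use the HEALTHCHECK instruction to check its health status.
The HEALTHCHECK instruction has two forms:
- HEALTHCHECK [OPTIONS] CMD command: check container health by running a command inside the container
- HEALTHCHECK NONE: disable any healthcheck inherited from the base image
The HEALTHCHECK instruction tells Docker how to test whether the process in the container is still working properly.
Once a container has a healthcheck configured, it has a health status in addition to its normal Up status. This status is initially starting. Whenever a health check passes, it becomes healthy (whatever state it was in before). After a certain number of consecutive failures, it becomes unhealthy.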
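The options that can appear before CMD are:
- --interval=DURATION [default 30s]: the interval between checks
- --timeout=DURATION [default 30s]: the timeout for a single check
- --retries=N [default 3]: the number of consecutive failures needed before the container is marked unhealthy
The first health check runs interval seconds after the container starts, and each later check runs interval seconds after the previous one completes. If a single check takes longer than timeout seconds, that check is considered failed. After retries consecutive failures, the container is considered unhealthy.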
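The transition rules above can be sketched as a small shell function. This is only a simplified model for illustration, not Docker's implementation: the name health_status is made up, it takes the retries value followed by the exit codes of successive checks, and it ignores timing altogether.

```shell
#!/bin/sh
# health_status RETRIES [EXIT_CODE...]
# Hypothetical helper: replays a sequence of health check exit codes and
# prints the resulting health status (starting, healthy, or unhealthy).
health_status() {
    retries=$1
    shift
    status=starting   # initial status before any check has passed
    streak=0          # number of consecutive failed checks
    for code in "$@"; do
        if [ "$code" -eq 0 ]; then
            # A passing check always yields healthy, whatever came before.
            status=healthy
            streak=0
        else
            streak=$((streak + 1))
            # Only RETRIES consecutive failures flip the status to unhealthy.
            if [ "$streak" -ge "$retries" ]; then
                status=unhealthy
            fi
        fi
    done
    echo "$status"
}
```

For example, health_status 3 0 1 1 prints healthy because only two checks in a row have failed, while health_status 3 0 1 1 1 prints unhealthy.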
We can add the following health check instruction to the MySQL 5.5 image we customized earlier:
HEALTHCHECK --interval=60s --timeout=5s CMD /usr/local/mysql/bin/mysqladmin -uroot -p$(cat /data/save/mysql_root) ping | grep alive || exit 1
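Docker reads the exit status of the health check command: 0 means healthy and 1 means unhealthy (2 is reserved and should not be used). The instruction above decides based on whether the word alive appears in the output. As a minimal sketch, the same logic could live in a small script inside the image; check_alive is a hypothetical helper name, not something Docker or MySQL provides.

```shell
#!/bin/sh
# check_alive COMMAND [ARGS...]
# Hypothetical helper: runs the given command and reports healthy (0) only
# if its combined stdout/stderr contains the word "alive".
check_alive() {
    if "$@" 2>&1 | grep -q alive; then
        return 0   # healthy
    fi
    return 1       # unhealthy
}
```

Such a script might end with check_alive /usr/local/mysql/bin/mysqladmin -uroot -p"$(cat /data/save/mysql_root)" ping and be referenced from the Dockerfile through HEALTHCHECK ... CMD followed by the script's path, which keeps the instruction short and makes the check easy to run by hand.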
Let's see how the status changes now:
[root@Docker_Machine_192.168.31.130 ~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
664c68483278 volumes/my_percona_server:v5.5.61 "/mysql_init.sh /usr…" 37 seconds ago Up 36 seconds (health: starting) 3306/tcp percona-server-v5.5.61_test1
[root@Docker_Machine_192.168.31.130 ~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
664c68483278 volumes/my_percona_server:v5.5.61 "/mysql_init.sh /usr…" About a minute ago Up About a minute (healthy) 3306/tcp percona-server-v5.5.61_test1
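Rather than running docker ps -a repeatedly by hand, a script can poll the health status until it leaves starting. The following is a rough sketch under a few assumptions: wait_for_healthy is a made-up name, the try count and delay defaults are arbitrary, and the status is read with docker inspect and a Go template.

```shell
#!/bin/sh
# wait_for_healthy NAME [MAX_TRIES] [DELAY_SECONDS]
# Hypothetical helper: polls a container's health status.
# Returns 0 once it is healthy, 1 if it becomes unhealthy,
# and 2 if it is still not healthy after MAX_TRIES polls.
wait_for_healthy() {
    name=$1
    max_tries=${2:-30}
    delay=${3:-2}
    i=0
    while [ "$i" -lt "$max_tries" ]; do
        # Empty or failing output (for example, no healthcheck configured)
        # is treated as "unknown" and simply polled again.
        status=$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null) || status=unknown
        case "$status" in
            healthy)   return 0 ;;
            unhealthy) return 1 ;;
        esac
        i=$((i + 1))
        sleep "$delay"
    done
    return 2
}
```

With the container from this example, wait_for_healthy percona-server-v5.5.61_test1 60 5 would poll every 5 seconds for up to 60 tries.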
Now we deliberately overwrite the root password saved inside the container, which makes mysqladmin's ping command fail, to verify the result:
[root@Docker_Machine_192.168.31.130 ~]# docker exec -it percona-server-v5.5.61_test1 /bin/bash
bash-4.2# echo 123 > /data/save/mysql_root
bash-4.2# usr/local/mysql/bin/mysqladmin -uroot -p$(cat /data/save/mysql_root) ping
usr/local/mysql/bin/mysqladmin: connect to server at 'localhost' failed
error: 'Access denied for user 'root'@'localhost' (using password: YES)'
bash-4.2# exit
After a while, we check the status again:
[root@Docker_Machine_192.168.31.130 ~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
1e5bbc5a8bfd volumes/my_percona_server:v5.5.61.2 "/mysql_init.sh /usr…" 4 minutes ago Up 4 minutes (unhealthy) 3306/tcp percona-server-v5.5.61_test1
By now it has become unhealthy.
Let's look at the State section of the docker inspect output:
"State": {
"Status": "running",
"Running": true,
"Paused": false,
"Restarting": false,
"OOMKilled": false,
"Dead": false,
"Pid": 5587,
"ExitCode": 0,
"Error": "",
"StartedAt": "2018-10-23T03:14:40.518866729Z",
"FinishedAt": "0001-01-01T00:00:00Z",
"Health": {
"Status": "unhealthy",
"FailingStreak": 5,
"Log": [
{
"Start": "2018-10-23T11:16:40.703827739+08:00",
"End": "2018-10-23T11:16:40.820505828+08:00",
"ExitCode": 1,
"Output": "\u0007/usr/local/mysql/bin/mysqladmin: connect to server at 'localhost' failed\nerror: 'Access denied for user 'root'@'localhost' (using password: YES)'\n"
},
{
"Start": "2018-10-23T11:17:41.093626922+08:00",
"End": "2018-10-23T11:17:41.221555939+08:00",
"ExitCode": 1,
"Output": "\u0007/usr/local/mysql/bin/mysqladmin: connect to server at 'localhost' failed\nerror: 'Access denied for user 'root'@'localhost' (using password: YES)'\n"
},
{
"Start": "2018-10-23T11:18:41.496533988+08:00",
"End": "2018-10-23T11:18:41.633191971+08:00",
"ExitCode": 1,
"Output": "\u0007/usr/local/mysql/bin/mysqladmin: connect to server at 'localhost' failed\nerror: 'Access denied for user 'root'@'localhost' (using password: YES)'\n"
},
{
"Start": "2018-10-23T11:19:41.905348235+08:00",
"End": "2018-10-23T11:19:42.024334261+08:00",
"ExitCode": 1,
"Output": "\u0007/usr/local/mysql/bin/mysqladmin: connect to server at 'localhost' failed\nerror: 'Access denied for user 'root'@'localhost' (using password: YES)'\n"
},
{
"Start": "2018-10-23T11:20:42.048432466+08:00",
"End": "2018-10-23T11:20:42.174141725+08:00",
"ExitCode": 1,
"Output": "\u0007/usr/local/mysql/bin/mysqladmin: connect to server at 'localhost' failed\nerror: 'Access denied for user 'root'@'localhost' (using password: YES)'\n"
}
]
}
},
The process is in the running state, but the health check reports unhealthy.
The Output field of each Log entry clearly shows why the container is unhealthy.
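To pull out just the health information instead of reading the whole inspect output, something like the following could be used. The function names are hypothetical wrappers; the underlying commands are docker inspect with a Go template and docker ps with its health filter.

```shell
#!/bin/sh
# container_health_log NAME
# Hypothetical wrapper: prints only the Health section (Status,
# FailingStreak, Log) of a container as JSON.
container_health_log() {
    docker inspect --format '{{json .State.Health}}' "$1"
}

# list_unhealthy
# Hypothetical wrapper: lists running containers whose health status
# is currently unhealthy.
list_unhealthy() {
    docker ps --filter health=unhealthy
}
```

For the container above, container_health_log percona-server-v5.5.61_test1 would print the same Health object shown in the inspect output, on a single line.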
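By default, Docker can only judge a container's state from the exit code of the container's process. A health check lets it judge, from the application's point of view, whether the application is misbehaving.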