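I have a small web server running mysql and wordpress that seems to stop handling web requests after a while. I cannot even log in to the server over ssh, because the ssh client times out while trying to establish a connection. The only way to recover the server is a hard reboot.
I left an ssh session open for 10 hours and watched the server slowly die. By the time it got to that point the machine seemed stuck, but top was still working. I was able to quit top, shut down mysql and httpd, and then run uptime repeatedly. Within 10 minutes of shutting down httpd and mysqld, the load average went from 101.73 to 0.01.
Below is the data I was able to collect.
My questions:
- What does the data mean?
- Is this machine short on CPU or RAM?
- Would a bigger box solve the problem?
- What other tools can I use to determine the cause of this problem?
Here is a snapshot of top from before I quit it and shut down httpd and mysqld: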
top - 11:00:18 up 13:54, 1 user, load average: 96.13, 94.78, 90.06
Tasks: 173 total, 1 running, 172 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.5%us, 1.1%sy, 0.0%ni, 0.0%id, 98.4%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1016284k total, 1008232k used, 8052k free, 580k buffers
Swap: 2096440k total, 2095168k used, 1272k free, 9872k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7 root 20 0 0 0 0 S 0.2 0.0 0:09.98 events/0
18 root 20 0 0 0 0 S 0.1 0.0 0:11.66 kblockd/0
1267 root 20 0 114m 316 232 S 0.1 0.0 0:00.41 crond
4779 apache 20 0 270m 11m 548 D 0.1 1.2 0:00.68 httpd
4878 apache 20 0 261m 17m 896 D 0.1 1.8 0:00.44 httpd
5046 apache 20 0 272m 19m 1168 D 0.1 1.9 0:00.69 httpd
5258 apache 20 0 244m 2552 1300 D 0.1 0.3 0:00.01 httpd
...... stuff I have removed to make this list short
1532 root 20 0 105m 4 4 S 0.0 0.0 0:00.01 mysqld_safe
1634 mysql 20 0 713m 8656 1612 S 0.0 0.9 1:13.79 mysqld
1805 root 20 0 244m 976 80 S 0.0 0.1 0:03.43 httpd
Data from the uptime command:
11:01:50 up 13:55, 1 user, load average: 99.15, 95.94, 90.88
11:05:19 up 13:59, 2 users, load average: 101.73, 97.93, 92.65
11:05:45 up 13:59, 2 users, load average: 67.02, 90.07, 90.18
11:07:27 up 14:01, 2 users, load average: 11.61, 63.36, 80.53
11:07:30 up 14:01, 2 users, load average: 11.61, 63.36, 80.53
11:07:35 up 14:01, 2 users, load average: 10.68, 62.31, 80.10
11:07:39 up 14:01, 2 users, load average: 9.83, 61.28, 79.67
11:07:41 up 14:01, 2 users, load average: 9.04, 60.26, 79.24
11:07:43 up 14:01, 2 users, load average: 9.04, 60.26, 79.24
11:07:48 up 14:01, 2 users, load average: 8.31, 59.26, 78.82
11:07:50 up 14:01, 2 users, load average: 8.31, 59.26, 78.82
11:07:52 up 14:01, 2 users, load average: 7.65, 58.28, 78.39
11:07:54 up 14:01, 2 users, load average: 7.65, 58.28, 78.39
11:07:56 up 14:01, 2 users, load average: 7.65, 58.28, 78.39
11:07:57 up 14:02, 2 users, load average: 7.04, 57.31, 77.97
11:07:58 up 14:02, 2 users, load average: 7.04, 57.31, 77.97
11:08:04 up 14:02, 2 users, load average: 6.47, 56.36, 77.55
11:08:05 up 14:02, 2 users, load average: 6.47, 56.36, 77.55
11:08:06 up 14:02, 2 users, load average: 5.95, 55.42, 77.14
11:08:08 up 14:02, 2 users, load average: 5.95, 55.42, 77.14
11:08:09 up 14:02, 2 users, load average: 5.95, 55.42, 77.14
11:08:10 up 14:02, 2 users, load average: 5.95, 55.42, 77.14
11:08:11 up 14:02, 2 users, load average: 5.48, 54.50, 76.72
11:08:12 up 14:02, 2 users, load average: 5.48, 54.50, 76.72
11:08:14 up 14:02, 2 users, load average: 5.48, 54.50, 76.72
11:08:15 up 14:02, 2 users, load average: 5.48, 54.50, 76.72
11:08:16 up 14:02, 2 users, load average: 5.04, 53.60, 76.31
11:08:17 up 14:02, 2 users, load average: 5.04, 53.60, 76.31
11:08:19 up 14:02, 2 users, load average: 5.04, 53.60, 76.31
11:08:20 up 14:02, 2 users, load average: 5.04, 53.60, 76.31
11:08:22 up 14:02, 2 users, load average: 4.63, 52.70, 75.90
11:08:23 up 14:02, 2 users, load average: 4.63, 52.70, 75.90
11:08:25 up 14:02, 2 users, load average: 4.63, 52.70, 75.90
11:08:26 up 14:02, 2 users, load average: 4.26, 51.83, 75.49
11:08:27 up 14:02, 2 users, load average: 4.26, 51.83, 75.49
11:08:28 up 14:02, 2 users, load average: 4.26, 51.83, 75.49
11:08:29 up 14:02, 2 users, load average: 4.26, 51.83, 75.49
11:08:33 up 14:02, 2 users, load average: 3.92, 50.97, 75.09
11:08:36 up 14:02, 2 users, load average: 3.61, 50.12, 74.68
11:08:38 up 14:02, 2 users, load average: 3.61, 50.12, 74.68
11:08:40 up 14:02, 2 users, load average: 3.61, 50.12, 74.68
11:08:41 up 14:02, 2 users, load average: 3.32, 49.29, 74.28
11:09:11 up 14:03, 2 users, load average: 2.01, 44.58, 71.92
11:09:13 up 14:03, 2 users, load average: 2.01, 44.58, 71.92
11:09:24 up 14:03, 2 users, load average: 1.70, 43.11, 71.15
11:09:25 up 14:03, 2 users, load average: 1.70, 43.11, 71.15
11:10:41 up 14:04, 2 users, load average: 0.48, 33.53, 65.62
11:10:43 up 14:04, 2 users, load average: 0.44, 32.98, 65.27
11:10:53 up 14:04, 2 users, load average: 0.38, 31.89, 64.57
11:10:55 up 14:04, 2 users, load average: 0.38, 31.89, 64.57
11:11:38 up 14:05, 2 users, load average: 0.18, 27.43, 61.51
11:11:40 up 14:05, 2 users, load average: 0.18, 27.43, 61.51
11:11:41 up 14:05, 2 users, load average: 0.18, 27.43, 61.51
11:11:41 up 14:05, 2 users, load average: 0.16, 26.97, 61.18
11:11:42 up 14:05, 2 users, load average: 0.16, 26.97, 61.18
11:11:43 up 14:05, 2 users, load average: 0.16, 26.97, 61.18
11:11:45 up 14:05, 2 users, load average: 0.16, 26.97, 61.18
11:12:06 up 14:06, 2 users, load average: 0.10, 24.80, 59.56
11:12:10 up 14:06, 2 users, load average: 0.10, 24.80, 59.56
11:14:30 up 14:08, 2 users, load average: 0.01, 15.52, 51.21
11:14:37 up 14:08, 2 users, load average: 0.01, 15.00, 50.66
Answer:
If you look at these lines in the top output:
Mem: 1016284k total, 1008232k used, 8052k free, 580k buffers
Swap: 2096440k total, 2095168k used, 1272k free, 9872k cached
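You have run out of both RAM and swap. I suspect that if you watch the output of vmstat 10, you will see the machine struggling, with constant swap-in and swap-out activity in the si and so columns.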
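To put numbers on that, here is a minimal Python sketch that parses the two lines quoted above and reports how full RAM and swap are. The helper names `parse_top_memory` and `percent_used` are made up for illustration, and the parsing assumes the older top layout shown here (values suffixed with `k`); newer versions of top print these lines differently.

```python
import re

# The two summary lines copied from the top snapshot in the question.
TOP_LINES = """\
Mem: 1016284k total, 1008232k used, 8052k free, 580k buffers
Swap: 2096440k total, 2095168k used, 1272k free, 9872k cached
"""


def parse_top_memory(text):
    """Return e.g. {'Mem': {'total': 1016284, 'used': ...}, 'Swap': {...}} in KiB."""
    usage = {}
    for line in text.splitlines():
        label, _, rest = line.partition(":")
        # Each field looks like "1016284k total".
        fields = {name: int(value) for value, name in re.findall(r"(\d+)k (\w+)", rest)}
        if fields:
            usage[label.strip()] = fields
    return usage


def percent_used(fields):
    """Percentage of the total that is in use."""
    return 100.0 * fields["used"] / fields["total"]


usage = parse_top_memory(TOP_LINES)
for label, fields in usage.items():
    print(f"{label}: {percent_used(fields):.1f}% used")
```

With both figures above 99%, the box has essentially nothing left to hand out, which fits the 98.4%wa in the Cpu(s) line: the processes are waiting on disk (swap) I/O, not on the CPU.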
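A machine running MySQL and Apache should use almost no swap. I suspect you need to change the MySQL settings to fit the available memory (for example, a smaller query cache, a smaller InnoDB buffer pool, and so on). You could also lower the maximum number of Apache child processes allowed. Or you may have a runaway script (PHP, etc.) that uses a lot of memory (sort your top output by RSS to find it).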
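As a rough illustration of sizing the Apache child limit against RAM, here is a small Python sketch. Only the total memory figure comes from the top output above. The MySQL budget, the OS reserve, and the per-child size are placeholder assumptions and should be replaced with values measured on the real machine while it is healthy (the RES values in the snapshot are not reliable, since most of each process had already been pushed out to swap). The function name `max_apache_children` is hypothetical.

```python
KIB_PER_MIB = 1024


def max_apache_children(total_kib, mysql_budget_kib, os_reserve_kib, per_child_kib):
    """Largest number of httpd children that fits in the RAM left after the reserves."""
    available_kib = total_kib - mysql_budget_kib - os_reserve_kib
    if available_kib <= 0 or per_child_kib <= 0:
        return 0
    return available_kib // per_child_kib


total_kib = 1016284                    # "Mem: ... total" from the top output
mysql_budget_kib = 256 * KIB_PER_MIB   # assumption: memory set aside for mysqld
os_reserve_kib = 128 * KIB_PER_MIB     # assumption: kernel, sshd, cron, page cache
per_child_kib = 30 * KIB_PER_MIB       # assumption: resident size of one httpd child

limit = max_apache_children(total_kib, mysql_budget_kib, os_reserve_kib, per_child_kib)
print(f"Suggested upper bound on Apache children: {limit}")
```

With the prefork MPM, the resulting number is what would go into the MaxClients directive (named MaxRequestWorkers in Apache 2.4). Treat it as a starting point and then confirm with vmstat that the machine stays out of swap under load.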