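Background: the Prometheus server kept dropping all external connections within two hours of starting. The web services provided by Prometheus and Grafana became unreachable and SSH clients could not connect to the machine, so I rebooted the instance. After the reboot the same thing happened again, so I investigated the operating system and traced the problem to the instance's network interface, which showed the following (the transcripts come from a zh_CN locale, so 二 is Tuesday and 8月 is August):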
[root@prometheus /var/log]# systemctl status network
● network.service - LSB: Bring up/down networking
   Loaded: loaded (/etc/rc.d/init.d/network; bad; vendor preset: disabled)
   Active: failed (Result: exit-code) since 二 2021-08-17 18:02:07 CST; 7h left
     Docs: man:systemd-sysv-generator(8)
  Process: 1172 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=1/FAILURE)
   CGroup: /system.slice/network.service
           └─1359 /sbin/dhclient -1 -q -lf /var/lib/dhclient/dhclient--ens5.lease -pf /var/run/dhclient-ens5.pid -H prometheus ens5

8月 17 18:02:05 prometheus dhclient[1301]: DHCPACK from 172.31.32.1 (xid=0x41460a8e)
8月 17 18:02:07 prometheus dhclient[1301]: bound to 172.31.44.100 -- renewal in 1762 seconds.
8月 17 18:02:07 prometheus network[1172]: Determining IP information for ens5... done.
8月 17 18:02:07 prometheus network[1172]: [ OK ]
8月 17 18:02:07 prometheus network[1172]: Bringing up interface eth0: ERROR : [/etc/sysconfig/network-scripts/ifup-eth] Device eth0 does not seem to be present, delaying initialization.
8月 17 18:02:07 prometheus network[1172]: [FAILED]
8月 17 18:02:07 prometheus systemd[1]: network.service: control process exited, code=exited status=1
8月 17 18:02:07 prometheus systemd[1]: Failed to start LSB: Bring up/down networking.
8月 17 18:02:07 prometheus systemd[1]: Unit network.service entered failed state.
8月 17 18:02:07 prometheus systemd[1]: network.service failed.

When I first noticed the problem I restarted the network service, with this result:

[root@prometheus /var/log]# systemctl restart network
Job for network.service failed because the control process exited with error code. See "systemctl status network.service" and "journalctl -xe" for details.
[root@prometheus /var/log]# systemctl status network
● network.service - LSB: Bring up/down networking
   Loaded: loaded (/etc/rc.d/init.d/network; bad; vendor preset: disabled)
   Active: failed (Result: exit-code) since 二 2021-08-17 10:13:20 CST; 9s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 11979 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=1/FAILURE)
   CGroup: /system.slice/network.service
           └─1359 /sbin/dhclient -1 -q -lf /var/lib/dhclient/dhclient--ens5.lease -pf /var/run/dhclient-ens5.pid -H prometheus ens5

8月 17 10:13:20 prometheus network[11979]: RTNETLINK answers: File exists
8月 17 10:13:20 prometheus network[11979]: RTNETLINK answers: File exists
8月 17 10:13:20 prometheus network[11979]: RTNETLINK answers: File exists
8月 17 10:13:20 prometheus network[11979]: RTNETLINK answers: File exists
8月 17 10:13:20 prometheus network[11979]: RTNETLINK answers: File exists
8月 17 10:13:20 prometheus network[11979]: RTNETLINK answers: File exists
8月 17 10:13:20 prometheus systemd[1]: network.service: control process exited, code=exited status=1
8月 17 10:13:20 prometheus systemd[1]: Failed to start LSB: Bring up/down networking.
8月 17 10:13:20 prometheus systemd[1]: Unit network.service entered failed state.
8월 17 10:13:20 prometheus systemd[1]: network.service failed.
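The device name that the init script complains about is easy to miss in a long status dump. As a minimal sketch (the function name missing_devices is my own and not part of any tool), the following filter pulls out every device reported as "does not seem to be present" from log text on standard input:

```shell
# Print the unique device names that the network init script
# reported as missing. Reads log text from standard input.
missing_devices() {
    # Keep only lines with the "does not seem to be present" message
    # and reduce each one to the device name that precedes it.
    sed -n 's/.*Device \([^ ]*\) does not seem to be present.*/\1/p' | sort -u
}
```

It can be fed from the journal, for example journalctl -u network --no-pager | missing_devices. On the first transcript above it would print eth0, which is the interface that does not exist on this instance.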
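I then searched online for fixes for the two errors, "Device eth0 does not seem to be present, delaying initialization" and "RTNETLINK answers: File exists". The suggestions I found were roughly these:

Method 1: a conflict with the NetworkManager service. Stop it with service NetworkManager stop, disable it at boot with chkconfig NetworkManager off, then reboot. (I tried this, but the machine has no NetworkManager service, so it did not apply.)

Method 2: a MAC address that does not match the configuration file. Edit the MAC address in /etc/udev/rules.d/70-persistent-net.rules so that it matches the one in /etc/sysconfig/network-scripts/ifcfg-ens5. (I tried this, but on the AWS instance the interface configuration file contains no MAC address, so it did not apply.)

Method 3: run ip addr flush dev ens5. (I tried this and it did not solve the problem.)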
Resolution:
I noticed that the host's IP address is bound to the ens5 interface, while the error says "Device eth0 does not seem to be present". My guess was that the eth0 interface defined on the AWS instance was interfering with the startup of ens5. After the following steps the network service restarted successfully:

[root@prometheus /etc/sysconfig/network-scripts]# mv ifcfg-eth0 /root
[root@prometheus /etc/sysconfig/network-scripts]# kill -9 1359
[root@prometheus /etc/sysconfig/network-scripts]# rm -rf /var/lib/dhclient/dhclient--ens5.lease /var/run/dhclient-ens5.pid /etc/udev/rules.d/70-persistent-net.rules
[root@prometheus /etc/sysconfig/network-scripts]# systemctl restart network   # the restart succeeded at this point
[root@prometheus /etc/sysconfig/network-scripts]# systemctl status network
● network.service - LSB: Bring up/down networking
   Loaded: loaded (/etc/rc.d/init.d/network; bad; vendor preset: disabled)
   Active: active (running) since 二 2021-08-17 11:34:47 CST; 8min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 25605 ExecStop=/etc/rc.d/init.d/network stop (code=exited, status=0/SUCCESS)
  Process: 25782 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/network.service
           └─25966 /sbin/dhclient -1 -q -lf /var/lib/dhclient/dhclient--ens5.lease -pf /var/run/dhclient-ens5.pid -H prometheus ens5

8月 17 11:34:45 prometheus systemd[1]: Starting LSB: Bring up/down networking...
8月 17 11:34:45 prometheus network[25782]: Bringing up loopback interface: [ OK ]
8月 17 11:34:45 prometheus network[25782]: Bringing up interface ens5:
8月 17 11:34:45 prometheus dhclient[25911]: DHCPREQUEST on ens5 to 255.255.255.255 port 67 (xid=0x3567fc41)
8月 17 11:34:45 prometheus dhclient[25911]: DHCPACK from 172.31.32.1 (xid=0x3567fc41)
8月 17 11:34:47 prometheus dhclient[25911]: bound to 172.31.44.100 -- renewal in 1363 seconds.
8月 17 11:34:47 prometheus network[25782]: Determining IP information for ens5... done.
8月 17 11:34:47 prometheus network[25782]: [ OK ]
8月 17 11:34:47 prometheus systemd[1]: Started LSB: Bring up/down networking.
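The fix above came down to moving away an ifcfg file for an interface that does not exist. A rough way to spot such files before restarting the service is sketched below. It rests on a few assumptions: the function name find_stale_ifcfg is my own, the default paths are the usual CentOS 7 locations, and an interface counts as present if it has an entry under /sys/class/net. The function only prints file names and changes nothing.

```shell
# List ifcfg-* files whose interface does not exist on this machine.
# Arguments (both optional):
#   $1  directory holding the ifcfg-* files
#   $2  directory with one entry per kernel network interface
find_stale_ifcfg() {
    local cfg_dir="${1:-/etc/sysconfig/network-scripts}"
    local net_dir="${2:-/sys/class/net}"
    local cfg dev

    for cfg in "$cfg_dir"/ifcfg-*; do
        [ -f "$cfg" ] || continue   # the glob matched nothing
        # Prefer the DEVICE= line; fall back to the file name suffix.
        dev=$(sed -n 's/^DEVICE=//p' "$cfg" | head -n 1 | tr -d "\"'")
        [ -n "$dev" ] || dev="${cfg##*/ifcfg-}"
        # Report the file if the kernel has no such interface.
        [ -e "$net_dir/$dev" ] || echo "$cfg"
    done
}
```

Called without arguments on a machine in the state described here, I would expect it to list ifcfg-eth0 and not ifcfg-ens5 or ifcfg-lo. Whether a listed file can safely be moved aside still needs a manual check, since backup copies such as ifcfg-eth0.bak are also matched by the glob.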
Fixing the "Device eth0 does not seem to be present, delaying initialization" error when bringing up a network interface on Linux