Tracking Down a DHCP (dhclient) Error on CentOS 7

Posted by 好风 on 2018-04-29T00:49:06.607302Z
Source URL: https://plus.ooclab.com/note/article/1407

After creating a KVM instance from the CentOS 7 Cloud Image, I ran into a strange problem: the eth0 interface occasionally loses its IP address.

References:

Environment:

Since the network uses libvirtd's default DHCP service, my initial guess is that dhclient inside the guest failed to renew its lease automatically.

dhclient logs to /var/log/messages, which shows:

5664 Apr 29 00:54:41 gwind-ceph-node-1 dhclient[647]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 8 (xid=0x3c3a2311)
5665 Apr 29 00:54:41 gwind-ceph-node-1 dhclient[647]: DHCPREQUEST on eth0 to 255.255.255.255 port 67 (xid=0x3c3a2311)
5666 Apr 29 00:54:41 gwind-ceph-node-1 dhclient[647]: DHCPOFFER from 192.168.122.1
5667 Apr 29 00:54:41 gwind-ceph-node-1 dhclient[647]: DHCPACK from 192.168.122.1 (xid=0x3c3a2311)
5668 Apr 29 00:54:43 gwind-ceph-node-1 NET[694]: /usr/sbin/dhclient-script : updated /etc/resolv.conf
5669 Apr 29 00:54:43 gwind-ceph-node-1 dhclient[647]: bound to 192.168.122.94 -- renewal in 1773 seconds.
5670 Apr 29 00:54:43 gwind-ceph-node-1 network: Determining IP information for eth0... done.
5671 Apr 29 00:54:44 gwind-ceph-node-1 network: [  OK  ]
5672 Apr 29 00:54:44 gwind-ceph-node-1 systemd: Started LSB: Bring up/down networking.
5673 Apr 29 00:54:44 gwind-ceph-node-1 systemd: Reached target Network.
5674 Apr 29 00:54:44 gwind-ceph-node-1 systemd: Starting Network.
5675 Apr 29 00:54:44 gwind-ceph-node-1 systemd: Starting Dynamic System Tuning Daemon...
5676 Apr 29 00:54:44 gwind-ceph-node-1 systemd: Starting Postfix Mail Transport Agent...
5677 Apr 29 00:54:44 gwind-ceph-node-1 systemd: Starting OpenSSH server daemon...
5678 Apr 29 00:54:44 gwind-ceph-node-1 systemd: Reached target Network is Online.
5679 Apr 29 00:54:44 gwind-ceph-node-1 systemd: Starting Network is Online.
5680 Apr 29 00:54:44 gwind-ceph-node-1 systemd: Starting Notify NFS peers of a restart...
5681 Apr 29 00:54:44 gwind-ceph-node-1 systemd: Starting Crash recovery kernel arming...
5682 Apr 29 00:54:44 gwind-ceph-node-1 sm-notify[754]: Version 1.3.0 starting
5683 Apr 29 00:54:44 gwind-ceph-node-1 systemd: Started Notify NFS peers of a restart.
5684 Apr 29 00:54:44 gwind-ceph-node-1 systemd: Started OpenSSH server daemon.
5685 Apr 29 00:54:46 gwind-ceph-node-1 systemd: Started Postfix Mail Transport Agent.
5686 Apr 29 00:54:47 gwind-ceph-node-1 systemd: Started Dynamic System Tuning Daemon.
5687 Apr 29 00:54:47 gwind-ceph-node-1 systemd: Reached target Multi-User System.
5688 Apr 29 00:54:47 gwind-ceph-node-1 systemd: Starting Multi-User System.
5689 Apr 29 00:54:47 gwind-ceph-node-1 systemd: Starting Update UTMP about System Runlevel Changes...
5690 Apr 29 00:54:47 gwind-ceph-node-1 systemd: Started Update UTMP about System Runlevel Changes.
5691 Apr 29 00:54:49 gwind-ceph-node-1 kdumpctl: kexec: loaded kdump kernel
5692 Apr 29 00:54:49 gwind-ceph-node-1 kdumpctl: Starting kdump: [OK]
5693 Apr 29 00:54:49 gwind-ceph-node-1 systemd: Started Crash recovery kernel arming.
5694 Apr 29 00:54:49 gwind-ceph-node-1 systemd: Startup finished in 4.238s (kernel) + 3.105s (initrd) + 14.451s (userspace) = 21.795s.
5695 Apr 28 16:55:16 gwind-ceph-node-1 chronyd[490]: Selected source 69.60.116.126
5696 Apr 28 16:55:16 gwind-ceph-node-1 chronyd[490]: System clock wrong by -28799.147380 seconds, adjustment started
5697 Apr 28 16:55:16 gwind-ceph-node-1 chronyd[490]: System clock was stepped by -28799.147380 seconds
5698 Apr 28 16:55:16 gwind-ceph-node-1 systemd: Time has been changed
5699 Apr 28 16:58:30 gwind-ceph-node-1 chronyd[490]: Selected source 61.216.153.106
5700 Apr 28 16:59:23 gwind-ceph-node-1 kernel: random: crng init done
5701 Apr 28 17:01:01 gwind-ceph-node-1 systemd: Created slice User Slice of root.
5702 Apr 28 17:01:01 gwind-ceph-node-1 systemd: Starting User Slice of root.
5703 Apr 28 17:01:01 gwind-ceph-node-1 systemd: Started Session 1 of user root.

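Starting at line 5664, dhclient obtains an IP address successfully (bound to 192.168.122.94, renewal in 1773 seconds). At line 5695, however, the timestamps jump back to the previous calendar day: chronyd stepped the system clock by -28799 seconds, roughly 8 hours. A backward step this large likely throws off dhclient's renewal timing, so the lease may not be renewed when it should be.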
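
To spot this kind of clock step without reading the log by eye, a small script can compare consecutive syslog timestamps. The following is only a minimal sketch: the function names are hypothetical, the year is an assumption (syslog lines do not record it), and it expects raw /var/log/messages lines without the editor line numbers shown in the excerpt above.

```python
from datetime import datetime

def parse_ts(line, year=2018):
    # Syslog lines start with "Mon DD HH:MM:SS"; the year is not logged,
    # so it is passed in as an assumption.
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")

def find_backward_jumps(lines, year=2018):
    """Return (previous, current, delta_seconds) for every backward time jump."""
    jumps = []
    prev = None
    for line in lines:
        if not line.strip():
            continue
        cur = parse_ts(line, year)
        if prev is not None and cur < prev:
            jumps.append((prev, cur, (cur - prev).total_seconds()))
        prev = cur
    return jumps

# Two lines taken from the log excerpt, with the editor line numbers removed.
sample = [
    "Apr 29 00:54:49 gwind-ceph-node-1 systemd: Started Crash recovery kernel arming.",
    "Apr 28 16:55:16 gwind-ceph-node-1 chronyd[490]: Selected source 69.60.116.126",
]

for prev, cur, delta in find_backward_jumps(sample):
    print(f"clock went backwards: {prev} -> {cur} ({delta:.0f} s)")
```

On the two sample lines the reported step is close to the -28799 seconds that chronyd logs; the small difference is the real time that passed between the two messages.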

Solution:

  • For Linux guests, set the clock to UTC in the libvirt domain XML: <clock offset='utc' />
  • Synchronize the system time with NTP.
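
The first item can be checked with a short script that reads the domain XML (for example, the output of `virsh dumpxml`) and reports the clock offset. This is a sketch under assumptions: the helper name `clock_offset` and the minimal sample documents are hypothetical, and a real domain definition contains many more elements.

```python
import xml.etree.ElementTree as ET

def clock_offset(domain_xml):
    """Return the offset attribute of the <clock> element, or None if it is absent."""
    root = ET.fromstring(domain_xml)
    clock = root.find("clock")  # <clock> is a direct child of <domain>
    return None if clock is None else clock.get("offset")

# Hypothetical, heavily trimmed domain definitions for illustration only.
sample_utc = "<domain type='kvm'><name>demo</name><clock offset='utc'/></domain>"
sample_local = "<domain type='kvm'><name>demo</name><clock offset='localtime'/></domain>"

for xml_text in (sample_utc, sample_local):
    offset = clock_offset(xml_text)
    status = "OK" if offset == "utc" else "guest clock is not set to UTC"
    print(offset, "->", status)
```

A guest whose domain XML reports anything other than `utc` here is a candidate for the clock step seen in the log.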

References: