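I am running RHEL 6.4, kernel-2.6.32-358.el6.i686, on an HP ML 350 G5 with two onboard Broadcom NetXtreme II BCM5708 1000Base-T NICs. My goal is to channel bond the two interfaces into a mode=1 failover pair.
My problem is that, despite all evidence that the bond is set up and accepted, pulling the cable out of the primary NIC causes all communication to cease.
ifcfg-eth0 and ifcfg-eth1
First, ifcfg-eth0: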
DEVICE=eth0
HWADDR=00:22:64:F8:EF:60
TYPE=Ethernet
UUID=99ea681d-831b-42a7-81be-02f71d1f7aa0
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=none
MASTER=bond0
SLAVE=yes
Next, ifcfg-eth1:
DEVICE=eth1
HWADDR=00:22:64:F8:EF:62
TYPE=Ethernet
UUID=92d46872-eb4a-4eef-bea5-825e914a5ad6
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=none
MASTER=bond0
SLAVE=yes
ifcfg-bond0
My bond's config file:
DEVICE=bond0
IPADDR=192.168.11.222
GATEWAY=192.168.11.1
NETMASK=255.255.255.0
DNS1=192.168.11.1
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
BONDING_OPTS="mode=1 miimmon=100"
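One way to double-check a BONDING_OPTS line is to compare each option name against the files the bonding driver exposes under /sys/class/net/bond0/bonding. The following is only a sketch: check_bonding_opts is a hypothetical helper name, and it assumes that every per-bond option corresponds to a sysfs file of the same name, which holds for common options such as mode, miimon, use_carrier and primary but may not hold for module-wide parameters.

```shell
# check_bonding_opts: hypothetical helper, not part of RHEL.
# $1: the BONDING_OPTS string, e.g. "mode=1 primary=eth0"
# $2: bonding sysfs directory (assumed default: bond0's)
check_bonding_opts() {
    opts="$1"
    dir="${2:-/sys/class/net/bond0/bonding}"
    status=0
    # Intentionally unquoted: split the option string on whitespace.
    for arg in $opts; do
        key="${arg%%=*}"          # text before the first '='
        if [ -e "$dir/$key" ]; then
            echo "ok: $key"
        else
            echo "unknown: $key"  # no sysfs file with this name
            status=1
        fi
    done
    return "$status"
}
```

On the server itself this could be run as `. /etc/sysconfig/network-scripts/ifcfg-bond0 && check_bonding_opts "$BONDING_OPTS"`, assuming the file lives in the usual RHEL location. Any name reported as unknown would be worth comparing against the bonding driver documentation.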
/etc/modprobe.d/bonding.conf
I have an /etc/modprobe.d/bonding.conf file that is populated as follows:
alias bond0 bonding
ip addr output
The bond is up, and I can access the server's public services via the bond's IP address:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether 00:22:64:f8:ef:60 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether 00:22:64:f8:ef:60 brd ff:ff:ff:ff:ff:ff
4: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 00:22:64:f8:ef:60 brd ff:ff:ff:ff:ff:ff
    inet 192.168.11.222/24 brd 192.168.11.255 scope global bond0
    inet6 fe80::222:64ff:fef8:ef60/64 scope link
       valid_lft forever preferred_lft forever
Bonding kernel module
...is loaded:
# cat /proc/modules | grep bond
bonding 111135 0 - Live 0xf9cdc000
/sys/class/net
The /sys/class/net filesystem shows good things:
cat /sys/class/net/bonding_masters
bond0
cat /sys/class/net/bond0/operstate
up
cat /sys/class/net/bond0/slave_eth0/operstate
up
cat /sys/class/net/bond0/slave_eth1/operstate
up
cat /sys/class/net/bond0/type
1
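The files above report link and interface state only. The bonding driver also publishes the parameters it is actually using under /sys/class/net/bond0/bonding (and a summary in /proc/net/bonding/bond0), which may differ from what the config file asked for. A minimal sketch for dumping the ones relevant to failover follows; show_bond_params is a hypothetical helper name, and the default path assumes the bond is called bond0.

```shell
# show_bond_params: hypothetical helper that prints selected bonding
# parameters as name=value pairs.
# $1: bonding sysfs directory (assumed default: bond0's)
show_bond_params() {
    dir="${1:-/sys/class/net/bond0/bonding}"
    for param in mode miimon use_carrier primary active_slave slaves; do
        if [ -r "$dir/$param" ]; then
            printf '%s=%s\n' "$param" "$(cat "$dir/$param")"
        else
            printf '%s=<unreadable>\n' "$param"
        fi
    done
}
```

Calling `show_bond_params` with no argument reads the live values. For an active-backup pair, mode, a non-zero miimon and the current active_slave are the values to look at first, since a miimon of 0 means MII link monitoring is disabled.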
/var/log/messages
Nothing of concern appears in the log file. In fact, everything looks rather happy.
Jun 15 15:47:28 rhsandbox2 kernel: Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)
Jun 15 15:47:28 rhsandbox2 kernel: bonding: bond0: setting mode to active-backup (1).
Jun 15 15:47:28 rhsandbox2 kernel: bonding: bond0: setting mode to active-backup (1).
Jun 15 15:47:28 rhsandbox2 kernel: bonding: bond0: setting mode to active-backup (1).
Jun 15 15:47:28 rhsandbox2 kernel: bonding: bond0: setting mode to active-backup (1).
Jun 15 15:47:28 rhsandbox2 kernel: bonding: bond0: Adding slave eth0.
Jun 15 15:47:28 rhsandbox2 kernel: bnx2 0000:03:00.0: eth0: using MSI
Jun 15 15:47:28 rhsandbox2 kernel: bonding: bond0: making interface eth0 the new active one.
Jun 15 15:47:28 rhsandbox2 kernel: bonding: bond0: first active interface up!
Jun 15 15:47:28 rhsandbox2 kernel: bonding: bond0: enslaving eth0 as an active interface with an up link.
Jun 15 15:47:28 rhsandbox2 kernel: bonding: bond0: Adding slave eth1.
Jun 15 15:47:28 rhsandbox2 kernel: bnx2 0000:05:00.0: eth1: using MSI
Jun 15 15:47:28 rhsandbox2 kernel: bonding: bond0: enslaving eth1 as a backup interface with an up link.
Jun 15 15:47:28 rhsandbox2 kernel: 8021q: adding VLAN 0 to HW filter on device bond0
Jun 15 15:47:28 rhsandbox2 kernel: bnx2 0000:03:00.0: eth0: NIC Copper Link is Up, 1000 Mbps full duplex
Jun 15 15:47:28 rhsandbox2 kernel: bnx2 0000:05:00.0: eth1: NIC Copper Link is Up, 1000 Mbps full duplex
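So what's the problem?!
Pulling the network cable from eth0 causes all communication to go dark. What could the problem be, and what further steps should I take to troubleshoot it?
EDIT:
Further troubleshooting: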
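The network is a single subnet on a single VLAN, provided by a ProCurve 1800-8G switch. I have added primary=eth0 to ifcfg-bond0 and restarted the network service, but that has not changed the behavior. I checked /sys/class/net/bond0/bonding/primary both before and after adding primary=eth1, and it has a null value, which I'm not sure is good or bad.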
Tailing /var/log/messages when eth1 has its cable removed shows nothing more than:
Jun 15 16:51:16 rhsandbox2 kernel: bnx2 0000:03:00.0: eth0: NIC Copper Link is Down
Jun 15 16:51:24 rhsandbox2 kernel: bnx2 0000:03:00.0: eth0: NIC Copper Link is Up, 1000 Mbps full duplex
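To make it easier to see whether the bonding driver reacts to a link event at all, the log can be filtered down to bonding and NIC driver lines only. This is a small sketch; bond_events is a hypothetical helper name, and the pattern assumes the `kernel: bonding:` and `kernel: bnx2` prefixes seen in the excerpts above.

```shell
# bond_events: hypothetical helper that prints only bonding and bnx2
# kernel messages from a syslog file.
# $1: log file (assumed default: /var/log/messages)
bond_events() {
    log="${1:-/var/log/messages}"
    grep -E 'kernel: (bonding:|bnx2 )' "$log"
}
```

After a cable pull, `bond_events | tail` would be expected to show a `bonding: bond0:` line about a new active interface next to the bnx2 link-down line if failover took place; seeing only the bnx2 lines suggests the bond never noticed the link change.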
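I added use_carrier=0 to the BONDING_OPTS section of ifcfg-bond0 to enable the use of MII/ETHTOOL ioctls. After restarting the network service, there was no change in symptoms. Pulling the cable from eth0 causes all network communication to cease. Once again, there are no errors in /var/log/messages other than the notification that the link on that port went down.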
Tailing /var/log/messages while eth0 is unplugged shows only that the copper link went down. There are no messages from the bonding module.
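A further step that separates "the backup path is broken" from "the bond never detects the failure" is to switch the active slave by hand instead of pulling a cable. In active-backup mode the bonding driver accepts a slave name written to its active_slave sysfs file. The sketch below assumes root privileges and a bond named bond0; set_active_slave is a hypothetical helper name.

```shell
# set_active_slave: hypothetical helper that switches the active slave of
# an active-backup bond and reports the value before and after.
# $1: slave to activate, e.g. eth1
# $2: bonding sysfs directory (assumed default: bond0's)
set_active_slave() {
    slave="$1"
    dir="${2:-/sys/class/net/bond0/bonding}"
    echo "before: $(cat "$dir/active_slave")"
    # The driver rejects the write if the slave is not part of the bond.
    echo "$slave" > "$dir/active_slave" || return 1
    echo "after: $(cat "$dir/active_slave")"
}
```

If `set_active_slave eth1` succeeds and traffic keeps flowing, the eth1 path and switch port are fine, and the investigation can focus on why link failures are not being detected.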