Is there a way to force Heartbeat to add a new IP address to the system without a full restart?


8

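We use Heartbeat for high availability. I want to add an additional IP address to the Heartbeat cluster, but I don't want to fully restart the cluster in the process. Is there a signal I can send to Heartbeat that makes it re-parse the haresources file and act on it? heartbeat -r doesn't seem to do the trick.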

Answers:


6

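The problem was that I wasn't waiting long enough after running "heartbeat -r" (the command the init.d script executes when you run "service heartbeat reload"). After a few minutes, the IP showed up on the interface as expected.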
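Since the reload takes effect only after a delay, it can help to poll for the address rather than guess how long to wait. The following is a minimal sketch, not anything shipped with Heartbeat: the function name wait_for_ip, its default timeout and interval, and the LIST_ADDRS_CMD override are all assumptions made for illustration, and it only looks at IPv4 addresses as printed by "ip -o addr show".

```shell
#!/bin/sh
# Sketch (hypothetical helper): wait until an IPv4 address appears on any
# interface, or give up after a timeout.
#
# usage: wait_for_ip ADDRESS [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
#
# LIST_ADDRS_CMD is an assumed override hook so the function can be exercised
# without touching real interfaces; by default it runs `ip -o addr show`.
wait_for_ip() {
    addr=$1
    timeout=${2:-300}    # assumed default: give up after five minutes
    interval=${3:-5}     # assumed default: check every five seconds
    elapsed=0
    while [ "$elapsed" -le "$timeout" ]; do
        # Match "inet ADDR/" as a fixed string so 10.0.0.5 does not match 10.0.0.50.
        if ${LIST_ADDRS_CMD:-ip -o addr show} | grep -qF "inet $addr/"; then
            echo "present after ${elapsed}s"
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    echo "not present after ${timeout}s" >&2
    return 1
}
```

It could be used as "service heartbeat reload && wait_for_ip xx.xx.xx.xx 600", with the real address in place of the placeholder.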


Heartbeat applies the change by itself? That actually has a very low suck quotient! If you know how long it takes, let us know :-)
voretaq7 '02

After reading this comment, I realized my answer was rather misleading. I scrapped the whole answer and rewrote it.
Peter Grace

Ah, that makes more sense: you do have to trigger a reload, but it isn't instantaneous. (And it's more deterministic, which makes me happy.)
voretaq7'2

2

You don't need to reload Heartbeat at all. Just add the new IPaddr resource to your haresources file, like this:

IPaddr::xx.xx.xx.xx

and then start it:

/etc/ha.d/resource.d/IPaddr xx.xx.xx.xx start

Of course, make sure you issue the IPaddr start on the active node. You should now be able to send and receive traffic on the IP address you just added.
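Because the manual start only makes sense if the address is also listed in haresources, the two steps can be tied together. This is a rough sketch under stated assumptions: the helper names has_ipaddr_entry and start_listed_ip are hypothetical, /etc/ha.d/haresources is assumed to be where the file lives, and the match covers only IPaddr:: and IPaddr2:: entries on non-comment lines.

```shell
#!/bin/sh
# Sketch (hypothetical helpers, not part of Heartbeat): start an IP by hand
# only if it is already listed in haresources.
HARESOURCES=${HARESOURCES:-/etc/ha.d/haresources}   # assumed file location
RESOURCE_D=${RESOURCE_D:-/etc/ha.d/resource.d}      # directory used in the answer above

# has_ipaddr_entry FILE ADDRESS
# Succeeds if a non-comment line contains IPaddr::ADDRESS or IPaddr2::ADDRESS,
# followed by "/", whitespace, or end of line.
has_ipaddr_entry() {
    file=$1
    addr=$2
    # Escape the dots so they match literally in the extended regex.
    pattern=$(printf '%s' "$addr" | sed 's/\./\\./g')
    grep -v '^[[:space:]]*#' "$file" |
        grep -Eq "IPaddr2?::${pattern}(/|[[:space:]]|\$)"
}

# start_listed_ip ADDRESS
# Runs the IPaddr resource script only when the address is in haresources.
start_listed_ip() {
    addr=$1
    if has_ipaddr_entry "$HARESOURCES" "$addr"; then
        "$RESOURCE_D/IPaddr" "$addr" start
    else
        echo "refusing: $addr is not listed in $HARESOURCES" >&2
        return 1
    fi
}
```

Run on the active node as "start_listed_ip xx.xx.xx.xx". The check only confirms that the entry exists in the local file; it says nothing about whether the other node has the same entry.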


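I'm going to hold off on accepting my own answer as correct because, even though what I did worked, your suggestion sounds quite elegant. I'd like to try it, and if it works, you'll get the upvote and the accepted answer.
Peter Grace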

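OK, here's the deal. I tried it and, lo and behold, it worked! The problem is that doing this without reloading Heartbeat leaves the cluster in an inconsistent state. I checked the source, and there are only three places where Heartbeat re-parses the haresources file, all three of them during a requested restart. So if the cluster were to fail over and fail back, an IP that was placed in haresources and manually instantiated with IPaddr <x> start would not be recreated during the failover. Feel free to prove me wrong, but relying on this method seems dangerous.
Peter Grace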

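Quite right. Heartbeat does not keep configuration files (such as haresources) in sync for you; you have to devise your own method. In my environment we typically use Unison for this, and it seems to work well. The haresources file is not cached, so it is re-read whenever it needs to be read. Any entry in haresources is started on a restart event (or any event that causes haresources to be read); this includes a failover.
Kendall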

0

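Just restart Heartbeat on the secondary machine; this avoids any downtime related to resource management.

In this case the master node detects that the slave machine is "dead" and forces a "failover", which reloads the haresources file and starts the missing resources.

The log is quite explicit when you do this: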

May  9 12:10:40 gw2 heartbeat: [3684]: info: Received shutdown notice from 'gw1'.
May  9 12:10:40 gw2 heartbeat: [3684]: info: Resources being acquired from gw1.
May  9 12:10:40 gw2 heartbeat: [26469]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
May  9 12:10:40 gw2 harc[26469]: info: Running /etc/ha.d//rc.d/status status
May  9 12:10:40 gw2 mach_down[26521]: info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
May  9 12:10:40 gw2 mach_down[26521]: info: mach_down takeover complete for node gw1.
May  9 12:10:40 gw2 heartbeat: [3684]: info: mach_down takeover complete.
May  9 12:10:40 gw2 heartbeat: [3684]: debug: StartNextRemoteRscReq(): child count 1
May  9 12:10:40 gw2 IPaddr2[26520]: INFO:  Running OK
May  9 12:10:40 gw2 IPaddr2[26640]: INFO:  Running OK
May  9 12:10:40 gw2 IPaddr2[26725]: INFO:  Running OK
May  9 12:10:40 gw2 IPaddr2[26805]: INFO:  Running OK
May  9 12:10:40 gw2 IPaddr2[26890]: INFO:  Resource is stopped
May  9 12:10:40 gw2 heartbeat: [26470]: info: Local Resource acquisition completed.
May  9 12:10:40 gw2 heartbeat: [3684]: debug: StartNextRemoteRscReq(): child count 1
May  9 12:10:40 gw2 heartbeat: [26953]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
May  9 12:10:40 gw2 harc[26953]: info: Running /etc/ha.d//rc.d/ip-request-resp ip-request-resp
May  9 12:10:40 gw2 ip-request-resp[26953]: received ip-request-resp IPaddr2::1.2.3.4 OK yes
May  9 12:10:40 gw2 ResourceManager[26976]: info: Acquiring resource group: gw2 IPaddr2::1.2.3.4
May  9 12:10:40 gw2 IPaddr2[27006]: INFO:  Resource is stopped
May  9 12:10:40 gw2 ResourceManager[26976]: info: Running /etc/ha.d/resource.d/IPaddr2 1.2.3.4 start
May  9 12:10:40 gw2 IPaddr2[27115]: INFO: ip -f inet addr add 1.2.3.4/24 brd 1.2.3.255 dev brwan
May  9 12:10:40 gw2 IPaddr2[27115]: INFO: ip link set brwan up
May  9 12:10:40 gw2 IPaddr2[27115]: INFO: /usr/lib/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-1.2.3.4 brwan 1.2.3.4 auto not_used not_used
May  9 12:10:40 gw2 IPaddr2[27091]: INFO:  Success

May  9 12:10:47 gw2 heartbeat: [3684]: WARN: node gw1: is dead
May  9 12:10:47 gw2 heartbeat: [3684]: info: Dead node gw1 gave up resources.
May  9 12:10:47 gw2 heartbeat: [3684]: info: Link gw1:eth0 dead.

May  9 12:10:59 gw2 heartbeat: [3684]: info: Heartbeat restart on node gw1
May  9 12:10:59 gw2 heartbeat: [3684]: info: Link gw1:eth0 up.
May  9 12:10:59 gw2 heartbeat: [3684]: info: Status update for node gw1: status init
May  9 12:10:59 gw2 heartbeat: [3684]: info: Status update for node gw1: status up
May  9 12:10:59 gw2 heartbeat: [3684]: debug: StartNextRemoteRscReq(): child count 1
May  9 12:10:59 gw2 heartbeat: [28604]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
May  9 12:10:59 gw2 heartbeat: [3684]: debug: get_delnodelist: delnodelist= 
May  9 12:10:59 gw2 harc[28604]: info: Running /etc/ha.d//rc.d/status status
May  9 12:10:59 gw2 heartbeat: [3684]: info: Status update for node gw1: status active
May  9 12:10:59 gw2 heartbeat: [3684]: debug: StartNextRemoteRscReq(): child count 1
May  9 12:10:59 gw2 heartbeat: [28619]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
May  9 12:10:59 gw2 harc[28619]: info: Running /etc/ha.d//rc.d/status status
May  9 12:10:59 gw2 heartbeat: [28634]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
May  9 12:10:59 gw2 harc[28634]: info: Running /etc/ha.d//rc.d/status status
May  9 12:11:00 gw2 heartbeat: [3684]: info: remote resource transition completed.
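To confirm from a log like the one above which resources were actually started, the ResourceManager "Running ... start" lines can be pulled out. This is only a sketch: started_resources is a hypothetical helper name, and it assumes the line format shown in this excerpt, which may differ between Heartbeat versions.

```shell
#!/bin/sh
# Sketch (hypothetical helper): read Heartbeat log lines on stdin and print
# "<agent> <arguments>" for every resource that ResourceManager started.
# Assumes lines of the form seen in the excerpt above:
#   ... ResourceManager[PID]: info: Running /etc/ha.d/resource.d/AGENT ARGS start
started_resources() {
    sed -n 's|.*ResourceManager\[[0-9]*]: info: Running /etc/ha\.d/resource\.d/\([^ ]*\) \(.*\) start$|\1 \2|p'
}
```

Fed the excerpt above, it would print the single line "IPaddr2 1.2.3.4". Where the log lives depends on the local logging configuration, so the input file is left to the caller, for example "started_resources < your-heartbeat-log".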