Setting up a local Stratum 2 NTP server

9

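I'm trying to set up NTP on a local network that does not have (and never will have) an internet connection. The main priority is that the machines on the network are synchronized with each other, even if the time they are synchronized to isn't 100% accurate.

We also have a requirement to use an NTP hierarchy that replicates the setup of a deployed system. What I want is a hierarchy of machines like this: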

Moon  (Main Server running Windows) (10.1.3.10)
|____Earth   (Linux x64 client) (10.1.3.1)
|____Mars    (Linux x64 client) (10.1.3.2)
|____Saturn  (Linux x64 client) (10.1.3.3)
|____RackCard23   (Linux x64 client and server to the two machines below)  (10.1.3.23)
     |___RackCard21   (Linux x64 client) (10.1.4.21)
     |___RackCard22   (Linux x64 client) (10.1.4.22)

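Note that the rack cards have two Ethernet ports, one connected to the 10.1.3.x network and the other to the 10.1.4.x network. RackCard23 synchronizes with the main server, Moon, over the 10.1.3.x network, and RackCard21 / 22 connect to RackCard23 over the 10.1.4.x network. This is because I don't want RackCard21 / 22 to leave their own network to synchronize time, and because it replicates the final deployed system.

So far I have managed to get everything that should synchronize to Moon (including RackCard23) doing so correctly.

However, I'm struggling to get RackCard21 and 22 to synchronize to RackCard23.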

[root@RackCard23]# cat /etc/ntp.conf
# NTP Deamon Configuration File "ntp.conf"
# Created on 27/04/2010
# Original backed-up as "ntp.conf.backup"

server 10.1.3.10 iburst minpoll 4 maxpoll 4 prefer #This is what we want to happen
fudge   127.127.1.0 stratum 2   #Not sure about these two lines, was trying to force it to be a stratum 2 server
fudge   127.127.0.1 stratum 2

# Drift file.  Put this in a directory which the daemon can write to.
# No symbolic links allowed, either, since the daemon updates the file
# by creating a temporary in the same directory and then rename()'ing
# it to the file.
driftfile /var/lib/ntp/drift
restrict 10.1.3.10 mask 255.255.255.255 nomodify notrap noquery

#Attempt to get to act as an NTP Server
broadcast 10.1.4.255

restrict 10.1.3.21 mask 255.255.255.255 nomodify notrap
restrict 10.1.4.21 mask 255.255.255.255 nomodify notrap

This is the output from ntptrace:

[root@RackCard23]# /usr/sbin/ntptrace
localhost.localdomain: stratum 16, offset 0.000000, synch distance 0.000030

As you can see, although the machine is synchronized to a "stratum 1" server (Moon), it still reports itself as a stratum 16 server:

[root@RackCard23 awd]# /usr/sbin/ntpdate -d 10.1.3.10
21 Jun 13:55:09 ntpdate[19410]: ntpdate 4.2.2p1@1.1570-o Tue May 19 13:57:56 UTC 2009 (1)
Looking for host 10.1.3.10 and service ntp
host found : 10.1.3.10
transmit(10.1.3.10)
receive(10.1.3.10)
transmit(10.1.3.10)
receive(10.1.3.10)
transmit(10.1.3.10)
receive(10.1.3.10)
transmit(10.1.3.10)
receive(10.1.3.10)
transmit(10.1.3.10)
server 10.1.3.10, port 123
stratum 1, precision -6, leap 00, trust 000
refid [LOCL], delay 0.04135, dispersion 0.00383
transmitted 4, in filter 4
reference time:    cfc99402.e010624d  Mon, Jun 21 2010  8:32:18.875
originate timestamp: cfc9dfad.48000000  Mon, Jun 21 2010 13:55:09.281
transmit timestamp:  cfc9dfad.47e27179  Mon, Jun 21 2010 13:55:09.280
filter delay:  0.04155  0.04155  0.04137  0.04135
         0.00000  0.00000  0.00000  0.00000
filter offset: -0.01448 0.000781 0.000537 0.000394
         0.000000 0.000000 0.000000 0.000000
delay 0.04135, dispersion 0.00383
offset 0.000394

21 Jun 13:55:09 ntpdate[19410]: adjust time server 10.1.3.10 offset 0.000394 sec

The configuration of the clients (RackCard21 / 22) looks like this:

[root@RackCard21]# cat /etc/ntp.conf
# NTP Deamon Configuration File "ntp.conf"
# Created on 27/04/2010
# Original backed-up as "ntp.conf.backup"

server 10.1.4.23 iburst minpoll 4 maxpoll 4 prefer

server 127.127.1.0
fudge   127.127.1.0 stratum 10

# Drift file.  Put this in a directory which the daemon can write to.
# No symbolic links allowed, either, since the daemon updates the file
# by creating a temporary in the same directory and then rename()'ing
# it to the file.
driftfile /var/lib/ntp/drift

# restrict 127.0.0.1

restrict None mask 255.255.255.255 nomodify notrap noquery

Running ntpdate -d against RackCard23 gives this:

[root@RackCard21]# /usr/sbin/ntpdate -d 10.1.4.23
21 Jun 14:04:34 ntpdate[14381]: ntpdate 4.2.2p1@1.1570-o Tue May 19 13:57:56 UTC 2009 (1)
Looking for host 10.1.4.23 and service ntp
host found : 10.1.4.23
transmit(10.1.4.23)
receive(10.1.4.23)
transmit(10.1.4.23)
receive(10.1.4.23)
transmit(10.1.4.23)
receive(10.1.4.23)
transmit(10.1.4.23)
receive(10.1.4.23)
transmit(10.1.4.23)
10.1.4.23: Server dropped: strata too high
server 10.1.4.23, port 123
stratum 16, precision -20, leap 11, trust 000
refid [10.1.4.23], delay 0.02568, dispersion 0.00000
transmitted 4, in filter 4
reference time:    00000000.00000000  Thu, Feb  7 2036  6:28:16.000
originate timestamp: cfc9dfef.12b79516  Mon, Jun 21 2010 13:56:15.073
transmit timestamp:  cfc9e1e2.aeae7d56  Mon, Jun 21 2010 14:04:34.682
filter delay:  0.02573  0.02571  0.02568  0.02568
         0.00000  0.00000  0.00000  0.00000
filter offset: -499.609 -499.609 -499.609 -499.609
         0.000000 0.000000 0.000000 0.000000
delay 0.02568, dispersion 0.00000
offset -499.609286

21 Jun 14:04:34 ntpdate[14381]: no server suitable for synchronization found

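So it can't find a suitable server, because the server I want to use reports itself as stratum 16 (which I believe means unsynchronized), despite the fact that it is synchronized.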
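
For readers comparing the two ntpdate -d dumps above, here is a minimal Python sketch that reads the status line from each and applies the usual rule of thumb: a server is only usable when its stratum is between 1 and 15 and its leap bits are not 11 (the "clock not synchronized" alarm). The function names are hypothetical, and this only approximates the kind of sanity check that produces "strata too high"; it is not ntpdate's actual code.

```python
import re

def parse_status(line):
    """Extract (stratum, leap) from an `ntpdate -d` status line.

    The leap field is printed as two bits, e.g. "00" or "11".
    """
    match = re.search(r"stratum (\d+), precision (-?\d+), leap ([01]{2})", line)
    if match is None:
        raise ValueError("not an ntpdate status line: %r" % line)
    return int(match.group(1)), int(match.group(3), 2)

def usable_for_sync(stratum, leap):
    """Assumed rule of thumb: stratum 1..15 and no 'unsynchronized' alarm (leap 3)."""
    return 1 <= stratum <= 15 and leap != 3

# Status lines copied from the two dumps in the question.
moon = parse_status("stratum 1, precision -6, leap 00, trust 000")
rackcard23 = parse_status("stratum 16, precision -20, leap 11, trust 000")

print(usable_for_sync(*moon))        # Moon is accepted
print(usable_for_sync(*rackcard23))  # RackCard23 is rejected
```

Under these assumptions Moon passes the check while RackCard23 fails on both counts, which matches the "no server suitable for synchronization found" message.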

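So I need to somehow get RackCard23 to report a better (numerically lower) stratum, ideally stratum 2. How do I do that?

Any help is greatly appreciated, as I've been at this for days!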

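Edit:

Hi Christopher,

Yes, I have been restarting ntpd ;)

All of the Linux machines are running CentOS 5.4.

Here is the output of the commands you suggested. First from the server: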

[root@RackCard23]# /usr/sbin/ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 10.1.3.10       .INIT.          16 u    -   16    0    0.000    0.000   0.000
 10.1.4.255      .BCST.          16 u    -   64    0    0.000    0.000   0.001

[root@RackCard23]# /usr/sbin/ntpdc -c monlist
remote address          port local address      count m ver code avgint  lstint
===============================================================================
localhost.localdomain  34566 127.0.0.1              1 7 2      0      0       0
10.1.4.21                123 10.1.4.23              5 3 4    180      5       1
10.1.4.22                123 10.1.4.23              7 3 4      0      2       2

Then from the client:

[root@RackCard21]# /usr/sbin/ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 10.1.4.23       .INIT.          16 u   10   16    0    0.000    0.000   0.000
 LOCAL(0)        .LOCL.          10 l   44   64    1    0.000    0.000   0.001

ntp

— fwgx

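If you don't have an internet connection, what is your time source? Did I miss it somewhere?

— dbasnett 2010

The time source isn't important, because we're not after 100% accurate time. What we need is for all the machines to be synchronized with each other, even if that means their time is more than 10 minutes away from the real time. So we use an arbitrary machine on the network as the master time source, i.e. just its internal clock. We know and accept that this is unreliable, but as long as everything is in sync, that's fine for us. In the actual deployed system we will synchronize to a time source on another system that we have no control over, which may or may not be more accurate.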

— fwgx

5

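As Chris mentioned, stratum 16 indicates that the server hasn't actually synchronized with a server yet. Just to be sure, you did restart the ntp service, right? (service ntpd restart) I don't mean to imply you're missing the simple stuff, but I always do!

Could you post the output of a few more commands to help diagnose?

ntpq -p on both the client and the server. This should show the servers each one is configured with, and statistics for those servers.
ntpdc -c monlist on the server. This should show that the clients have connected.

Also, since you didn't mention the OS, I'm running with RHEL-style commands. Let me know if you're on something different.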

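Edit after further info
OK, having seen your output, here's your problem: you don't have a stratum 1 server. In effect, "Moon" is using its local clock. It will report itself as a stratum 16 server. For your reference, a Stratum 1 server will have a local GPS or atomic clock. Do you have one of those? Otherwise, Moon needs to synchronize its clock with another ntp server. Without network access, you'll need to fudge its stratum. (This requires that you don't care too much about the "real" time. You don't, but others reading this should be aware of it.)

On Moon, add the following line to your ntp.conf file: fudge 127.127.1.0 stratum 10. This will make it report its local clock as stratum 10, which will cause all the other servers to use it in preference to their own local stratum 16 clocks.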
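
To see what the fudge would mean for the rest of the network, here is a small Python sketch that walks the hierarchy from the question and gives each machine its parent's stratum plus one. It assumes every client actually manages to synchronize to its parent; the `HIERARCHY` table and `assign_strata` function are made up for illustration.

```python
from collections import deque

# Hierarchy from the question: each key serves time to the machines listed.
HIERARCHY = {
    "Moon": ["Earth", "Mars", "Saturn", "RackCard23"],
    "RackCard23": ["RackCard21", "RackCard22"],
}

def assign_strata(hierarchy, root, root_stratum):
    """Give every machine its parent's stratum plus one, capped at 16."""
    strata = {root: root_stratum}
    pending = deque([root])
    while pending:
        parent = pending.popleft()
        for child in hierarchy.get(parent, []):
            strata[child] = min(strata[parent] + 1, 16)
            pending.append(child)
    return strata

# Moon fudged to stratum 10, as suggested above.
strata = assign_strata(HIERARCHY, "Moon", 10)
print(strata["RackCard23"], strata["RackCard21"])  # 11 12
```

With Moon at 10, RackCard23 would come out at 11 and RackCard21 / 22 at 12, all comfortably below 16. Fudging Moon to a lower value shifts the whole tree down by the same amount.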

— Christopher Karel

I've added the results to the main question post.

— fwgx

Agree with Christopher. There are a lot of misconceptions about strata: ntp.org/ntpfaq/NTP-s-algo.htm

— dbasnett 2010

3

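This may be off-topic, but a local Stratum 2 server needs to connect to a Stratum 1 server, and on an isolated network you don't have one.

You can get a cheap GPS module and a Raspberry Pi, a single-board computer with very low power consumption and enough interfacing capability. Connect the GPS module to the Raspberry Pi and, with the appropriate software, join the Pi to your network. It can then act as the Stratum 1 NTP server for your Stratum 2 server or, since it is on your network, every machine can synchronize its time to it directly.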

2

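NTPd will set its own stratum according to the following:

If the drift of the local clock has not yet been evaluated, set the stratum to 16. On a normal server this takes about 15 minutes; then move on to the next step.
Connect to all configured time servers, evaluate which of them are reliable (and therefore preferred), and set the local stratum to the stratum of the lowest reliable server plus one. So if the lowest reliable server found is stratum 1, the local server is stratum 2.

(This isn't necessarily the order of events, but the order in which they are processed for the purpose of setting the local stratum.)
(Also, stratum 16 doesn't necessarily indicate that it is unsynchronized.)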
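
The two rules above can be written as a short Python sketch. This is a simplified model of the description, not ntpd's real selection algorithm; the function name and arguments are hypothetical, and a "reliable" server is assumed to be one that itself reports a stratum below 16.

```python
UNSYNCHRONIZED = 16  # stratum reported when there is nothing usable to follow

def local_stratum(drift_evaluated, server_strata):
    """Model of the two rules: 16 until ready, else lowest usable stratum + 1.

    drift_evaluated: whether the local clock drift has been evaluated yet.
    server_strata:   strata reported by the configured servers.
    """
    if not drift_evaluated:
        return UNSYNCHRONIZED
    # Assumption: a server at stratum 16 cannot be followed.
    usable = [s for s in server_strata if s < UNSYNCHRONIZED]
    if not usable:
        return UNSYNCHRONIZED
    return min(min(usable) + 1, UNSYNCHRONIZED)

print(local_stratum(True, [1]))       # 2: one stratum 1 server found
print(local_stratum(False, [1]))      # 16: drift not evaluated yet
print(local_stratum(True, [16, 16]))  # 16: both peers look unsynchronized
```

The last call mirrors the ntpq -p output from RackCard23 in the question, where both configured peers show st 16, so under this model the server has nothing to derive a better stratum from.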

— Chris S

1

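Could it be that, because Moon is a Windows XP Pro x64 machine using the default W32Time NTP service (which is actually Simple NTP, or SNTP), RackCard23 doesn't see it as a proper NTP server and therefore never sets its stratum to anything other than 16?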

— fwgx

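D'oh, I didn't see this before editing my post. That's quite likely. Is there any reason not to use a proper NTP client at the top of the hierarchy? (Windows or Unix based)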

— Christopher Karel

2

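As an aside, I'm going to do a bit of analysis of your ntpq output, just to help you and others with general troubleshooting in the future.

First, from your server: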

[root@RackCard23]# /usr/sbin/ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 10.1.3.10       .INIT.          16 u    -   16    0    0.000    0.000   0.000
 10.1.4.255      .BCST.          16 u    -   64    0    0.000    0.000   0.001

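The first column shows the two servers this machine is configured to sync to. Of note is the lack of a * or +, which would indicate the synchronized peer or a secondary candidate. This means your server won't use the entries here, but it is at least checking them.

The third column, "st", shows the stratum of those servers. In this case it indicates that both machines are using their local clock. (The default stratum is 16.) The last three columns show how far apart the two clocks are: delay is the delay between the two machines, offset is the straight difference between the clocks, and jitter is how much the measurements vary. Higher numbers are worse here.

Unsynchronized entries like these can have several causes. If the clock offset is too large, ntp won't even try, because it would cause too big a jump in the local time. If the jitter gets bad, the client will drop synchronization until things stabilize. (This is usually temporary, but it can recur.) Or, as in your case, if the configured servers have an equal or higher stratum value, indicating that they are less reliable as a time source, the client won't use them.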
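
For general troubleshooting, the column reading above can be automated with a short Python sketch. It splits one peer row of ntpq -p output into named fields and gives a rough verdict. The function names and verdict strings are made up for illustration, the row must keep its leading character (the tally mark, a space when the peer is not selected), and reach is read as the octal register that ntpq prints (0 means no replies so far, 377 means the last eight polls were all answered).

```python
def parse_ntpq_row(line):
    """Split one peer row of `ntpq -p` output into named fields."""
    tally = line[0]              # '*', '+', ... or ' ' when not selected
    fields = line[1:].split()
    remote, refid, st, _type, _when, _poll, reach = fields[:7]
    delay, offset, jitter = (float(x) for x in fields[7:10])
    return {
        "tally": tally,
        "remote": remote,
        "refid": refid,
        "stratum": int(st),
        "reach": int(reach, 8),  # ntpq prints reach in octal
        "delay": delay,
        "offset": offset,
        "jitter": jitter,
    }

def diagnose(row):
    """Very rough verdict for one parsed row (hypothetical wording)."""
    if row["tally"] in ("*", "+"):
        return "in use"
    if row["reach"] == 0:
        return "no replies received"
    if row["stratum"] >= 16:
        return "peer reports itself unsynchronized"
    return "reachable but not selected"

# Rows copied from the question (leading space preserved).
server_row = " 10.1.3.10       .INIT.          16 u    -   16    0    0.000    0.000   0.000"
client_row = " LOCAL(0)        .LOCL.          10 l   44   64    1    0.000    0.000   0.001"

print(diagnose(parse_ntpq_row(server_row)))  # no replies received
print(diagnose(parse_ntpq_row(client_row)))  # reachable but not selected
```

On the RackCard23 row for 10.1.3.10 this reports "no replies received" because reach is 0, which is worth checking alongside the stratum value.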

— Christopher Karel