Why does systemd hang during reboot?


13

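About 1 time in 10, systemd hangs during reboot. I don't understand why. Where and what should I look at to troubleshoot this? I'm using systemd v196 and cannot upgrade it to version >= 198, because the latter requires a recent kernel (with cgroups support), and the kernel cannot be updated due to customer requirements. I'd like to know whether there is a reasonable way to find the cause of this behavior and to make systemd reboot the system unconditionally.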

Please note that this link does not help: http://freedesktop.org/wiki/Software/systemd/Debugging/#index2h1

As you can see:

Shutdown Never Finishes

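If normal reboot or poweroff never finishes even after waiting a few minutes, the above method to create the shutdown log will not help and the log must be obtained using other methods. Two options that are useful for debugging boot problems can also be used for shutdown problems: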

use a serial console
use a debug shell - not only is it available from early boot, it also stays active until late shutdown.

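I'm using a serial console, and for some reason I am even able to log in, because the eth interface is up or comes back up (after the disconnection that occurs during the reboot steps).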

I have no idea what the reason is.

# cat /etc/systemd/system/
basic.target.wants/                          getty.target.wants/                          multi-user.target.wants/                     sysinit.target.wants/                        
dbus-org.freedesktop.NetworkManager.service  local-fs-pre.target.wants/                   sockets.target.wants/                        syslog.service                               
display-manager.service                      local-fs.target.wants/                       swap.target

Note the swap.target. It is there, but we don't use a swap partition at all. I tried masking swap, but the hang persists. The last line on the console is:

[OK] Stopped target shutdown.

Edit: As I said, I can log in again via ssh over eth.

Now I'll show you two logs. The first was captured when the reboot/shutdown hung, the second when the reboot succeeded:

In the hang case, the output always looks like this (full log):

[  OK  ] Stopped Network Time Service (one-shot ntpdate mode).
         Stopping Modem and VPN connections autoconnect...
         Stopping Login Service...
         Stopping LSB: Avahi mDNS/DNS-SD Daemon...
[  OK  ] Stopped Monitoring free system resources.
[  OK  ] Stopped Monitoring dropbear socket.
[  OK  ] Stopped Login Service.
[  OK  ] Stopped Modem and VPN c[  OK  ] Stopped Getty on tty1.
[  OK  ] Stopped Serial Getty on ttyO0.
[  OK  ] Unmounted /var/lib/opkg.
[  OK  ] Stopped Network Manager.
[  OK  ] Stopped LSB: Avahi mDNS/DNS-SD Daemon.
         Stopping D-Bus System Message Bus...
[  OK  ] Stopped target Remote File Systems.
[  OK  ] Stopped Suspend manager.
         Stopping X Server...
[  OK  ] Stopped X Server.
         Stopping System Logging Service...
[  OK  ] Stopped System Logging Service.
[   77.580000] g_ether gadget: using random self ethernet address
[   77.580000] g_ether gadget: using random host ethernet address
[   77.590000] usb0: MAC 6e:0d:de:b0:33:4f
[   77.590000] usb0: HOST MAC 62:7a:81:02:f3:ff
[   77.600000] g_ether gadget: Ethernet Gadget, version: Memorial Day 2008
[   77.600000] g_ether gadget: g_ether ready
[   77.610000] musb-hdrc musb-hdrc.0: MUSB HDRC host driver
[   77.610000] musb-hdrc musb-hdrc.0: new USB bus registered, assigned bus number 2
[   77.620000] usb usb2: New USB device found, idVendor=1d6b, idProduct=0002
[   77.630000] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[   77.640000] usb usb2: Product: MUSB HDRC host driver
[   77.640000] usb usb2: Manufacturer: Linux 2.6.37 musb-hcd
[   77.650000] usb usb2: SerialNumber: musb-hdrc.0
[   77.650000] hub 2-0:1.0: USB hub found
[   77.660000] hub 2-0:1.0: 1 port detected
[   77.690000] ADDRCONF(NETDEV_UP): usb0: link is not ready
[  OK  ] Stopped target Reboot.
[  OK  ] Stopped Reboot.
[  OK  ] Stopped target Unmount All Filesystems.
[  OK  ] Stopped target Shutdown.
[   78.330000] <46>systemd-journald[328]: Received SIGUSR1
<hang>

Normal reboot:

         Unmounting /var/lib/opkg...
[  OK  ] Stopped target Network.
         Stopping SSH Per-Connection Server...
[  OK  ] Stopped target Graphical Interface.
[  OK  ] Stopped target Multi-User.
         Stopping Monitoring free system resources...
         Stopping Monitoring dropbear socket...
         Stopping Network Time Service (one-shot ntpdate mode)...
[  OK  ] Stopped Network Time Service (one-shot ntpdate mode).
         Stopping Modem and VPN connections autoconnect...
         Stopping Login Service...
         Stopping LSB: Avahi mDNS/DNS-SD Daemon...
[  OK  ] Stopped Monitoring free system resources.
[  OK  ] Stopped Monitoring dropbear socket.
[  OK  ] Stopped Login Service.
[  OK  ] Unmounted /var/lib/opkg.
         Stopping Network Manager...
[  OK  ] Stopped Getty on tty1.
[  OK  ] Stopped Network Manager.
[  OK  ] Stopped Serial Getty on ttyO0.
[  OK  ] Stopped Suspend manager.
[  OK  ] Stopped LSB: Avahi mDNS/DNS-SD Daemon.
         Stopping D-Bus System Message Bus...
         Stopping X Server...
         Stopping Permit User Sessions...
[  OK  ] Stopped Permit User Sessions.
[  OK  ] Stopped target Remote File Systems.
[  OK  ] Stopped X Server.
[  OK  ] Stopped D-Bus System Message Bus.
         Stopping System Logging Service...
[  OK  ] Stopped System Logging Service.
[  OK  ] Stopped target Basic System.
[  OK  ] Stopped target Sockets.
[  OK  ] Closed dropbear.socket.
[  OK  ] Closed D-Bus System Message Bus Socket.
[  OK  ] Stopped target System Initialization.
         Stopping Import configuration from SD card...
[  OK  ] Stopped Import configuration from SD card.
         Stopping Load Kernel Modules...
         Stopping Apply Kernel Variables...
[  OK  ] Stopped Apply Kernel Variables.
[  OK  ] Stopped target Local File Systems.
         Unmounting /var...
         Unmounting /tmp...
[  OK  ] Closed Syslog Socket.
[  OK  ] Failed unmounting /var.
[  OK  ] Unmounted /tmp.
[  OK  ] Stopped Load Kernel Modules.
[  OK  ] Reached target Unmount All Filesystems.
[  OK  ] Stopped target Local File Systems (Pre).
         Stopping Remount Root and Kernel File Systems...
[  OK  ] Stopped Remount Root and Kernel File Systems.
[  OK  ] Reached target Shutdown.
[   52.340000] omap_wdt: Unexpected close, not stopping!
Sending SIGTERM to remaining processes...
[   52.490000] <46>systemd-journald[335]: Received SIGTERM
Sending SIGKILL to remaining processes...
Unmounting file systems.
Unmounting /sys/fs/fuse/connections.
Unmounting /var.
All filesystems unmounted.
Deactivating swaps.
All swaps deactivated.

Update:

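After some investigation and debugging, I found the reason why the shutdown is interrupted, although I still cannot solve it. For some reason, a custom service is started before the shutdown has completed, and that makes the shutdown process hang. That is one case of the hang. The other case is when the shutdown is not interrupted but simply stops at some point. So, before resolving all the conflicts and other possible hangs one by one, I'd like to activate the hardware watchdog unconditionally. To do that, I have enabled and tested RuntimeWatchdogSec and ShutdownWatchdogSec, separately and together. Unfortunately, they didn't help. Looking at the source code,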

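I'm stuck. What I'd like to ask for is a way to: 1. unconditionally enable the watchdog, at least from the start of shutdown; 2. detect and resolve all the conflicts in an easy way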

The first option is preferred.
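
For reference, RuntimeWatchdogSec and ShutdownWatchdogSec are read from the [Manager] section of /etc/systemd/system.conf, and a line that is still commented out with # has no effect. A minimal sketch for checking which of the two are actually set; the function name watchdog_settings is made up for this example, and it only reads the file, so it says nothing about whether the watchdog hardware is really being armed:

```shell
# Hypothetical helper: print the watchdog options that are actually set
# (uncommented) in a systemd manager configuration file.
watchdog_settings() {
    wd_conf=${1:-/etc/systemd/system.conf}
    # Lines starting with '#' are commented-out defaults and are skipped.
    grep -E '^[[:space:]]*(RuntimeWatchdogSec|ShutdownWatchdogSec)=' "$wd_conf"
}
```

Running `watchdog_settings` with no argument checks the default file; empty output would mean neither option is active.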


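Does it hang on the way down? Can you share with us which services are enabled on the system? Are any of them custom-made? How did you come to the conclusion that systemd hangs?
MattBianco 2014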

@MattBianco I edited the question. There is more information now.
Martin

Why don't I see any identical lines between the first and second logs? If I could see where they start to differ, I would be able to help more.
BenjiWiebe 2014
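
One way to see where two console logs differ is to compare the sets of units each one reports as stopped. A rough sketch, assuming both logs were saved to files and use the "[  OK  ] Stopped ..." line format shown in the question; the function names stopped_units and missing_in_hang are invented for this example:

```shell
# Hypothetical helpers for comparing two saved console logs.

# Print the unit descriptions from "[  OK  ] Stopped <name>." lines, sorted.
stopped_units() {
    sed -n 's/^\[ *OK *\] Stopped \(.*\)\.$/\1/p' "$1" | sort -u
}

# usage: missing_in_hang hang.log ok.log
# Print units that were stopped in the good log but not in the hanging one.
missing_in_hang() {
    mih_hang=$(mktemp)
    mih_ok=$(mktemp)
    stopped_units "$1" > "$mih_hang"
    stopped_units "$2" > "$mih_ok"
    # comm -13 keeps only the lines unique to the second file.
    comm -13 "$mih_hang" "$mih_ok"
    rm -f "$mih_hang" "$mih_ok"
}
```

Lines that the serial console interleaved or cut off will not match cleanly, so the result is a starting point rather than an exact diff.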

@BenjiWiebe You are right. I'll edit the question again.
Martin

Try running journalctl as root and look for timeouts, failures and dependency errors in the systemd log.
harrymc

Answers:


5

I'll venture a solution: try adding

  Before=basic.target

to /usr/lib/systemd/system/dbus.service.

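In your logs I was struck by an odd fact, which reminded me of an incident I read about some time ago on the Arch Linux forums: a system that would hang on reboot. The solution offered was the one above, and the reason given was that the hang was caused by some services trying to communicate with d-bus after it had been stopped: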

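So by ordering it before basic.target, not only is it started before the basic target is reached, it is also guaranteed to stay around until after basic.target has been taken down during shutdown.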

In your unhealthy log, we see that Basic System is in fact never stopped, whereas it is stopped correctly in the healthy log.

Should this not work, and given that you cannot upgrade, have you considered downgrading?
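
Rather than editing the packaged file in place, the same change can be made on a copy under /etc/systemd/system, which takes precedence over /usr/lib/systemd/system. A minimal sketch, assuming the unit file has a [Unit] section header on its own line; the function name add_before_basic is made up for this example, and it does not check whether the extra ordering creates a dependency cycle:

```shell
# Hypothetical helper: copy a unit file and add "Before=basic.target"
# directly below its [Unit] section header.
# usage: add_before_basic <source unit file> <destination unit file>
add_before_basic() {
    abb_src=$1
    abb_dst=$2
    awk '{ print } /^\[Unit\]/ { print "Before=basic.target" }' "$abb_src" > "$abb_dst"
}
```

For example, `add_before_basic /usr/lib/systemd/system/dbus.service /etc/systemd/system/dbus.service` followed by `systemctl daemon-reload` would apply the change; deleting the copy and reloading again reverts it.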


1
Thanks, I'll try your solution. I had been considering going back to the old SysV, because systemd seems to be a design mistake.
Martin

After applying the change, I get this from systemd at boot: Found ordering cycle, skipping D-Bus System Message Bus. Any ideas?
Martin

@Martin 1) Do you have / and /usr on separate partitions? 2) Is there a lot of stuff in /etc/init.d? Or in /etc/rc.d?
MariusMatutiae 2014

1
This works great on Ubuntu 16.04; the file is /usr/lib/systemd/user/dbus.service, and the line goes under [Unit]
Anwar

3

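By default, shutdown.target conflicts with all other units, so that they are stopped automatically when the shutdown process starts. This also works the other way around: if another unit is started, it causes shutdown.target to be stopped. So the problem is that something causes a unit to be started during shutdown, which overrides the shutdown process.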

This should have been fixed in systemd v198, which makes the shutdown jobs "irreversible".
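
Since the machine in the question is still reachable over ssh while it hangs, the job queue can be inspected at that moment. A small sketch using `systemctl list-jobs` and `systemctl show`; the function name inspect_shutdown is made up for this example, and the property names printed may vary with the systemd version:

```shell
# Hypothetical helper: show what systemd is still working on and
# which conflict relations shutdown.target has.
inspect_shutdown() {
    echo "== queued jobs =="
    systemctl list-jobs
    echo "== conflict relations of shutdown.target =="
    # Keep only the Conflicts=/ConflictedBy= style properties.
    systemctl show shutdown.target | grep -i '^Conflict'
}
```

A start job for an unexpected unit in the first list would be a candidate for the unit that cancels the shutdown.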


I can't upgrade :(
Martin

I have to find the conflicts and resolve them
Martin

1

Swap may still be active when "target Shutdown" is reached; my solution was to force swap deactivation before rebooting:

swapoff -a
swapoff /dev/md6

After that, rebooting works for me without any hang.
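
To check whether any swap is active at all before relying on this, /proc/swaps can be read directly. A minimal sketch; the function name active_swaps is made up for this example:

```shell
# Hypothetical helper: list the devices or files currently used as swap.
# /proc/swaps has one header line, then one line per active swap area.
active_swaps() {
    tail -n +2 "${1:-/proc/swaps}" | awk '{ print $1 }'
}
```

If `active_swaps` prints nothing, there is no active swap for `swapoff -a` to turn off.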

Licensed under cc by-sa 3.0 with attribution required.