缓慢的ssh登录-org.freedesktop.login1的激活超时


39

在我的一台服务器上,我注意到SSH登录确实很延迟。

使用ssh -vvv选项连接时,延迟发生在 debug1: Entering interactive session.

连接摘录:

debug1: Authentication succeeded (publickey).
Authenticated to IP_REDACTED ([IP_REDACTED]:22).
debug1: channel 0: new [client-session]
debug3: ssh_session2_open: channel_new: 0
debug2: channel 0: send open
debug1: Requesting no-more-sessions@openssh.com
debug1: Entering interactive session.
debug2: callback start
debug2: fd 3 setting TCP_NODELAY
debug3: packet_set_tos: set IP_TOS 0x10
debug2: client_session2_setup: id 0
debug2: channel 0: request pty-req confirm 1

使用此处概述的方法我生成了strace输出并注意到该行14:09:53.676004 ppoll([{fd=5, events=POLLIN}], 1, {24, 999645000}, NULL, 8) = 1 ([{fd=5, revents=POLLIN}], left {0, 0}) <25.020764>需要25秒。

strace输出的提取:

14:09:53.675567 clock_gettime(CLOCK_MONOTONIC, {4662549, 999741404}) = 0 <0.000024>
14:09:53.675651 recvmsg(5, {msg_name(0)=NULL, msg_iov(1)=[{"l\4\1\1\n\0\0\0\2\0\0\0\215\0\0\0\1\1o\0\25\0\0\0", 24}], msg_controll
en=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 24 <0.000024>
14:09:53.675744 recvmsg(5, {msg_name(0)=NULL, msg_iov(1)=[{"/org/freedesktop/DBus\0\0\0\2\1s\0\24\0\0\0"..., 146}], msg_controllen
=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 146 <0.000025>
14:09:53.675842 recvmsg(5, 0x7ffe0ff1dfa0, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailab
le) <0.000023>
14:09:53.675925 clock_gettime(CLOCK_MONOTONIC, {4662550, 96075}) = 0 <0.000024>
14:09:53.676004 ppoll([{fd=5, events=POLLIN}], 1, {24, 999645000}, NULL, 8) = 1 ([{fd=5, revents=POLLIN}], left {0, 0}) <25.020764>
14:10:18.696865 recvmsg(5, {msg_name(0)=NULL, msg_iov(1)=[{"l\3\1\0013\0\0\0\3\0\0\0m\0\0\0\6\1s\0\5\0\0\0", 24}], msg_controllen=0,     msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 24 <0.000017>
14:10:18.696944 recvmsg(5, {msg_name(0)=NULL, msg_iov(1)=[{":1.10\0\0\0\4\1s\0#\0\0\0org.freedesktop."..., 155}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 155 <0.000018>

我注意到相关时间的身份验证日志中有一个条目:

Jul 21 14:10:18 click sshd[8165]: pam_systemd(sshd:session): Failed to create session: Activation of org.freedesktop.login1 timed out

对此了解不足,它试图轮询什么,为什么现在在此特定服务器上花费25秒呢。

journalctl -u systemd-logind命令显示

Jul 20 11:33:06 click systemd-logind[19415]: Failed to abandon session scope: Transport endpoint is not connected
Jul 21 05:04:54 myhost systemd[1]: Started Login Service.
Jul 21 12:15:30 myhost systemd[1]: Started Login Service.
Jul 21 12:17:04 myhost systemd[1]: Started Login Service.
Jul 21 12:49:55 myhost systemd[1]: Started Login Service.
Jul 21 13:57:05 myhost systemd[1]: Started Login Service.
Jul 21 13:58:49 myhost systemd[1]: Started Login Service.
Jul 21 14:01:55 myhost systemd[1]: Started Login Service.
Jul 21 14:08:32 myhost systemd[1]: Started Login Service.
Jul 21 14:09:53 myhost systemd[1]: Started Login Service.
Jul 21 14:19:08 myhost systemd[1]: Started Login Service.
Jul 21 14:21:26 myhost systemd[1]: Started Login Service.
Jul 21 14:22:37 myhost systemd[1]: Started Login Service.
Jul 21 14:25:20 myhost systemd[1]: Started Login Service.
Jul 21 14:30:27 myhost systemd[1]: Started Login Service.
Jul 21 15:02:56 myhost systemd[1]: Started Login Service.

发出命令可以systemctl restart systemd-logind.service修复该问题(目前可能如此)。

什么是Activation of org.freedesktop.login1它提到?有什么办法可以防止以后再次启动登录?我希望随着时间的流逝,我管理的其余服务器都会遇到这个问题。

只是注意到这开始在另一台服务器上发生。

$ sudo service systemd-logind status

● systemd-logind.service - Login Service
   Loaded: loaded (/lib/systemd/system/systemd-logind.service; static)
   Active: active (running) since Tue 2015-06-16 14:10:57 BST; 1 months 12 days ago
     Docs: man:systemd-logind.service(8)
           man:logind.conf(5)
           http://www.freedesktop.org/wiki/Software/systemd/logind
           http://www.freedesktop.org/wiki/Software/systemd/multiseat
 Main PID: 1701 (systemd-logind)
   Status: "Processing requests..."
   CGroup: /system.slice/systemd-logind.service
           └─1701 /lib/systemd/systemd-logind

Jul 28 13:16:21 myhost systemd[1]: Started Login Service.
Jul 28 13:16:47 myhost systemd[1]: Started Login Service.
Jul 28 16:09:23 myhost systemd[1]: Started Login Service.
Jul 28 16:09:49 myhost systemd[1]: Started Login Service.
Jul 28 16:10:15 myhost systemd[1]: Started Login Service.
Jul 28 16:10:41 myhost systemd[1]: Started Login Service.
Jul 28 22:50:19 myhost systemd[1]: Started Login Service.
Jul 29 05:00:15 myhost systemd[1]: Started Login Service.
Jul 29 11:00:20 myhost systemd[1]: Started Login Service.
Jul 29 11:09:56 myhost systemd[1]: Started Login Service.

编辑-扩展journalctl输出。

EDIT2-当注意到在另一台服务器上启动时,如注释中所建议的那样添加了systemd-logind状态。

更新-我的其余Jessie服务器都开始发生这种情况。我是唯一经历过这种情况的人吗?除了重新启动systemd-logind之外,还必须有其他解决方法,有人有什么想法吗?

770135上有Debian错误报告。


systemcts status systemd-logind在重新启动之前查看的输出以查看其出了什么问题(退出,失败等),将很有用。ppoll只是一个调解员,正在等待systemd的响应,因此您不能责怪它。
2015年

没有systemcts命令
Alasdair

抱歉。systemctl当然
Jakuje

我以为那是您的意思,但想确定。这与命令提供的输出是否不同journalctl -u systemd-logind
Alasdair

它应该显示日志,还应该显示服务本身的状态。
2015年

Answers:



1

使用:

systemctl restart systemd-logind

仅暂时解决问题。

一种解决方法是删除所有.scope从cron作业文件,说明在这里

* 2,14 * * * root /bin/rm -f /run/systemd/system/*.scope

相关的systemd错误报告在此处:作用域单元泄漏会减慢“ systemctl list-unit-files”的速度并延迟登录

看来这实际上是dbus的错误:unix fd的运行中计数中断,这在dbus 1.11.10版中解决

要永久修复此错误,您只需要等待此版本的dbus出现在您的发行版中即可。目前,Debian Stretch在dbus 1.10.18上,Ubuntu 17.04(Zesty)在1.10.10上,CentOS 7在dbus 1.6.12上。

By using our site, you acknowledge that you have read and understand our Cookie Policy and Privacy Policy.
Licensed under cc by-sa 3.0 with attribution required.