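Several times now, a sudden power loss has left my ZFS pool unusable until I fully rebooted the system. I plan to buy a UPS to avoid this in the future, but it seems there should be a way to recover from such a simple problem without shutting the whole system down.
Reproducing the problem is easy: my ZFS pool runs on two hard drives connected over USB. This is the pool's status when it is working normally: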
$ sudo zpool status
pool: tank
state: ONLINE
scan: scrub repaired 0 in 1h36m with 0 errors on Sun Dec 11 02:00:22 2016
config:
NAME                                                  STATE     READ WRITE CKSUM
tank                                                  ONLINE       0     0     0
  mirror-0                                            ONLINE       0     0     0
    usb-ST4000DM_000-1F2168_000000000000-0:0-part1    ONLINE       0     0     0
    usb-ST3000DM_001-1E6166_000000000000-0:1-part1    ONLINE       0     0     0
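If I power off the USB drives without stopping ZFS first, then power them back on a few seconds later, the following happens: if I try to run ls inside the ZFS mount point, it hangs indefinitely and I have to close the terminal (the ls process stays around as a zombie). Any computer connected to this server via Samba also hangs when it tries to access the shared directories.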
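To check what state the hung processes are really in, something like the following could be used. It is only a sketch: `filter_stuck` is a hypothetical helper name, and it assumes a procps-style `ps`. A process blocked on suspended pool I/O usually shows up in state D (uninterruptible sleep) rather than Z (zombie), so looking at the STAT column can tell the two apart.

```shell
# Hypothetical helper: read "pid stat command" lines on stdin and keep only
# those whose STAT column (field 2) starts with D (uninterruptible sleep)
# or Z (zombie).
filter_stuck() {
    awk '$2 ~ /^[DZ]/'
}

# Usage on a live system (assumes procps ps; the trailing "=" drops headers):
#   ps -eo pid=,stat=,args= | filter_stuck
```

If the hung ls shows up with a D here, it is waiting inside the kernel and cannot be killed until the I/O it is waiting on completes or fails.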
The status is now:
$ sudo zpool status
pool: tank
state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
see: http://zfsonlinux.org/msg/ZFS-8000-HC
scan: scrub repaired 0 in 1h36m with 0 errors on Sun Dec 11 02:00:22 2016
config:
NAME                                                  STATE     READ WRITE CKSUM
tank                                                  UNAVAIL      0     0     0  insufficient replicas
  mirror-0                                            UNAVAIL      0     0     0  insufficient replicas
    usb-ST4000DM_000-1F2168_000000000000-0:0-part1    UNAVAIL      0     0     0
    usb-ST3000DM_001-1E6166_000000000000-0:1-part1    UNAVAIL      0     0     0
This is the case even though the USB drives have been powered on again.
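Since the status output asks to make sure the affected devices are connected, one way to confirm that the kernel has actually re-enumerated the drives is sketched below. The function names `unavail_devs` and `check_nodes` are hypothetical, and the sketch assumes the pool members are referenced by their names under /dev/disk/by-id, as the usb-* entries above suggest.

```shell
# Read `zpool status` output on stdin and print the usb-* leaf devices
# that are reported as UNAVAIL (awk ignores the leading indentation).
unavail_devs() {
    awk '$2 == "UNAVAIL" && $1 ~ /^usb-/ { print $1 }'
}

# For each device name on stdin, report whether a node with that name
# exists under the given directory (assumed default: /dev/disk/by-id).
check_nodes() {
    dir=${1:-/dev/disk/by-id}
    while read -r dev; do
        if [ -e "$dir/$dev" ]; then
            echo "present $dev"
        else
            echo "missing $dev"
        fi
    done
}

# Usage on a live system:
#   sudo zpool status tank | unavail_devs | check_nodes
```

A device reported as "missing" has not come back under its old name, in which case ZFS has nothing to reopen yet.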
I have tried the following commands to fix this:
$ sudo zpool clear tank
cannot clear errors for tank: I/O error
$ sudo zfs unmount tank
cannot open 'tank': pool I/O is currently suspended
# Note: Because other computers were trying to access the zfs share via samba, there are zombie processes, which is why an export won't work.
$ sudo zpool export tank
umount: /tank: target is busy
(In some cases useful info about processes that
use the device is found by lsof(8) or fuser(1).)
cannot unmount '/tank': umount failed
$ sudo zpool export -f tank
umount: /tank: target is busy
(In some cases useful info about processes that
use the device is found by lsof(8) or fuser(1).)
cannot unmount '/tank': umount failed
# Tried this just for kicks, and got the expected result.
$ sudo zpool import -nfF tank
cannot import 'tank': a pool with that name already exists
use the form 'zpool import <pool | id> <newpool>' to give it a new name
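The umount messages above point at lsof(8) and fuser(1). A rough sketch of following that hint is shown here: `pid_list` is a hypothetical helper that turns the PIDs `fuser -m` prints on stdout into the comma-separated form `ps -p` accepts. This is untested against a suspended pool, where fuser itself may block.

```shell
# Hypothetical helper: join whitespace-separated PIDs on stdin into a
# comma-separated list, stripping any leading or trailing comma.
pid_list() {
    tr -s ' \t\n' ',,,' | sed 's/^,//; s/,$//'
}

# Usage on a live system (fuser writes PIDs to stdout, the rest to stderr):
#   pids=$(sudo fuser -m /tank 2>/dev/null | pid_list)
#   [ -n "$pids" ] && ps -o pid,stat,args -p "$pids"
```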
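I have spent hours reading similar posts from other people, but nobody seems to have solved this. If I reboot the computer running ZFS, all the errors disappear, the dead processes are cleaned up, and everything goes back to normal.
But there must be a cleaner way to fix this. Any suggestions?
Edit: I should clarify. The server the drives are connected to is a repurposed laptop, so it has its own internal battery. As a result, in normal operation a power outage can cut power to the USB drives and then bring them back without the server/laptop ever rebooting.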