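Why is ZFS on Linux unable to fully utilize 8x SSDs on an AWS i2.8xlarge instance?

I'm completely new to ZFS, so to start with I wanted to run some simple benchmarks on it to get a feel for how it behaves. I wanted to push the limits of its performance, so I provisioned an Amazon EC2 i2.8xlarge instance (almost $7/hr, so time really is money!). This instance has eight 800GB SSDs.

I ran an fio test on one of the SSDs by itself and got the following output (trimmed):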

$ sudo fio --name randwrite --ioengine=libaio --iodepth=2 --rw=randwrite --bs=4k --size=400G --numjobs=8 --runtime=300 --group_reporting --direct=1 --filename=/dev/xvdb
[trimmed]
  write: io=67178MB, bw=229299KB/s, iops=57324, runt=300004msec
[trimmed]

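57K IOPS for 4K random writes. Respectable.

I then created a ZFS volume spanning all 8. At first I had one raidz1 vdev containing all 8 SSDs, but I read about the reasons this is bad for performance, so I ended up with four mirror vdevs, like so: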

$ sudo zpool create testpool mirror xvdb xvdc mirror xvdd xvde mirror xvdf xvdg mirror xvdh xvdi
$ sudo zpool list -v
NAME   SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
testpool  2.91T   284K  2.91T         -     0%     0%  1.00x  ONLINE  -
  mirror   744G   112K   744G         -     0%     0%
    xvdb      -      -      -         -      -      -
    xvdc      -      -      -         -      -      -
  mirror   744G    60K   744G         -     0%     0%
    xvdd      -      -      -         -      -      -
    xvde      -      -      -         -      -      -
  mirror   744G      0   744G         -     0%     0%
    xvdf      -      -      -         -      -      -
    xvdg      -      -      -         -      -      -
  mirror   744G   112K   744G         -     0%     0%
    xvdh      -      -      -         -      -      -
    xvdi      -      -      -         -      -      -
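
As a quick sanity check on the listing above: a mirror vdev exposes the capacity of a single member, and the pool stripes across its vdevs, so the pool size should be roughly the sum of the four 744G mirrors. A minimal sketch of that arithmetic (the function name is made up for illustration, and the sizes are assumed to be binary units as zpool reports them):

```python
# Sizes as reported by `zpool list -v` above, in GiB per mirror vdev.
MIRROR_SIZES_GIB = [744, 744, 744, 744]

def pool_size_tib(vdev_sizes_gib):
    """Sum the vdev sizes and convert GiB to TiB."""
    return sum(vdev_sizes_gib) / 1024

print(f"{pool_size_tib(MIRROR_SIZES_GIB):.2f}T")  # 2.91T, matching the SIZE column
```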

I set the recordsize to 4K and ran my test:

$ sudo zfs set recordsize=4k testpool
$ sudo fio --name randwrite --ioengine=libaio --iodepth=2 --rw=randwrite --bs=4k --size=400G --numjobs=8 --runtime=300 --group_reporting --filename=/testpool/testfile --fallocate=none
[trimmed]
  write: io=61500MB, bw=209919KB/s, iops=52479, runt=300001msec
    slat (usec): min=13, max=155081, avg=145.24, stdev=901.21
    clat (usec): min=3, max=155089, avg=154.37, stdev=930.54
     lat (usec): min=35, max=155149, avg=300.91, stdev=1333.81
[trimmed]

I get only 52K IOPS on this ZFS pool. That's actually slightly worse than one SSD by itself.
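
For reference, the two iops= figures follow directly from the reported bandwidth divided by the 4K block size, which makes the comparison easy to reproduce. A small sketch using only the numbers from the two fio runs above (the helper name is hypothetical):

```python
def iops_from_bandwidth(bw_kb_per_s, block_kb=4):
    """IOPS implied by a bandwidth figure at a fixed block size."""
    return bw_kb_per_s / block_kb

raw_ssd = iops_from_bandwidth(229299)    # single /dev/xvdb run
zfs_pool = iops_from_bandwidth(209919)   # four-mirror pool run

print(int(raw_ssd), int(zfs_pool))       # 57324 52479
print(f"pool / single SSD = {zfs_pool / raw_ssd:.2f}")
```

Note that the two runs are not identical: the raw-device run used --direct=1, while the ZFS run did not, so the ratio is only a rough guide.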

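I don't understand what I'm doing wrong here. Have I configured ZFS incorrectly, or is this a poor test of ZFS performance?

Note that I'm using the official 64-bit CentOS 7 HVM image, though I've upgraded to the 4.4.5 elrepo kernel: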

$ uname -a
Linux ip-172-31-43-196.ec2.internal 4.4.5-1.el7.elrepo.x86_64 #1 SMP Thu Mar 10 11:45:51 EST 2016 x86_64 x86_64 x86_64 GNU/Linux

I installed ZFS from the zfs repo listed here. I have version 0.6.5.5 of the zfs package.

UPDATE: Per @ewwhite's suggestion I tried ashift=12 and ashift=13:

$ sudo zpool create testpool mirror xvdb xvdc mirror xvdd xvde mirror xvdf xvdg mirror xvdh xvdi -o ashift=12 -f

and

$ sudo zpool create testpool mirror xvdb xvdc mirror xvdd xvde mirror xvdf xvdg mirror xvdh xvdi -o ashift=13 -f

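Neither of these made any difference. From what I understand, the latest ZFS bits are smart enough to identify 4K SSDs and use reasonable defaults.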
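
For readers unfamiliar with the setting: ashift is the base-2 logarithm of the sector size ZFS assumes for a vdev, so the two values tried here correspond to 4K and 8K sectors. A minimal sketch of the mapping (the function name is illustrative only):

```python
def sector_size_bytes(ashift):
    """ashift is the base-2 logarithm of the sector size ZFS assumes."""
    return 1 << ashift

for ashift in (9, 12, 13):
    print(f"ashift={ashift} -> {sector_size_bytes(ashift)} bytes")
```

One thing this makes visible: with ashift=13 the assumed sector (8K) is larger than the 4K recordsize used in the test, so that combination is probably not a like-for-like comparison.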

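I did notice, however, that CPU usage is spiking. @Tim suggested this earlier and I dismissed it, but I think I wasn't watching the CPU long enough to notice. There are something like 30 CPU cores on this instance, and CPU usage is spiking as high as 80%. The hungry process? z_wr_iss, lots of instances of it.

I confirmed compression is off, so it's not the compression engine.

I'm not using raidz, so it shouldn't be the parity computation.

I ran perf top, and it shows most of the kernel time spent in _raw_spin_unlock_irqrestore in z_wr_int_4 and in osq_lock in z_wr_iss.

I now believe there's a CPU component to this performance bottleneck, though I'm no closer to figuring out what it might be.

UPDATE 2: Per the suggestion from @ewwhite and others that the virtualized nature of this environment is what creates the performance uncertainty, I used fio to benchmark random 4K writes spread across four of the SSDs in the environment. Each SSD by itself gives about 55K IOPS, so I expected somewhere around 240K IOPS across four of them. That's more or less what I got: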

$ sudo fio --name randwrite --ioengine=libaio --iodepth=8 --rw=randwrite --bs=4k --size=398G --numjobs=8 --runtime=300 --group_reporting --filename=/dev/xvdb:/dev/xvdc:/dev/xvdd:/dev/xvde
randwrite: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=8
...
randwrite: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=8
fio-2.1.5
Starting 8 processes
[trimmed]
  write: io=288550MB, bw=984860KB/s, iops=246215, runt=300017msec
    slat (usec): min=1, max=24609, avg=30.27, stdev=566.55
    clat (usec): min=3, max=2443.8K, avg=227.05, stdev=1834.40
     lat (usec): min=27, max=2443.8K, avg=257.62, stdev=1917.54
[trimmed]

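This clearly shows that the environment, virtualized though it may be, can sustain IOPS much higher than what I'm seeing. Something about the way ZFS is implemented is keeping it from hitting top speed. I just can't figure out what that is.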
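
To put rough numbers on that gap, the three iops= figures reported so far can be compared directly. This is only a sketch: the runs used different iodepth and --direct settings, and it assumes (not stated in the post) that a four-mirror pool has about four devices' worth of write capacity, since each write lands on both halves of a mirror.

```python
SINGLE_SSD_IOPS = 57324    # one device, first fio run
FOUR_SSD_IOPS = 246215     # four devices, Update 2 run
ZFS_POOL_IOPS = 52479      # four-mirror pool run

def ratio(a, b):
    """Plain ratio, kept as a function so the comparisons read clearly."""
    return a / b

print(f"four raw SSDs vs one: {ratio(FOUR_SSD_IOPS, SINGLE_SSD_IOPS):.1f}x")
print(f"ZFS pool vs four raw SSDs: {ratio(ZFS_POOL_IOPS, FOUR_SSD_IOPS):.0%}")
```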

amazon-web-services zfs zfsonlinux

— anelson

You're using EC2. You only get as many IOPS as Amazon wants to give you.

— Michael Hampton

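Amazon gives me about 52K IOPS per SSD attached to this instance, and there are eight such SSDs attached. From the Amazon docs it's clear that an instance of this size is the only instance running on the physical host where it resides. Furthermore, these are local SSDs, not EBS volumes, so there are no other workloads contending for IO bandwidth. That doesn't account for the performance I'm seeing.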

— anelson

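Is this taxing the CPU or hitting memory limits?

— Tim

Have you read this series of articles? hatim.eu/2014/05/24/… Have any other articles helped at all?

— Tim

Just to rule out actual implementation deficiencies in zfsonlinux, I would try the same benchmark with a Solaris 11 install on the same instance.

— the-wabbit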

Answers:

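This setup may not be tuned well. When using SSDs, there are parameters needed both in the /etc/modprobe.d/zfs.conf file and for the ashift value.

Try ashift=12 or 13 and test again.

Edit:

This is still a virtualized solution, so we don't know much about the underlying hardware or how everything is interconnected. I don't know that you'll get better performance out of this solution.

Edit:

I guess I don't see the point of trying to optimize a cloud instance in this manner. If top performance were the aim, you'd be using hardware, right?

But remember that ZFS has a lot of tunable settings, and what you get by default isn't anywhere close to your use case.

Try the following in your /etc/modprobe.d/zfs.conf and reboot. It's what I use in my all-SSD data pools for application servers. Your ashift should be 12 or 13. Benchmark with compression=off, but use compression=lz4 in production. Set atime=off. I'd leave recordsize at the default (128K).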

options zfs zfs_vdev_scrub_min_active=48
options zfs zfs_vdev_scrub_max_active=128
options zfs zfs_vdev_sync_write_min_active=64
options zfs zfs_vdev_sync_write_max_active=128
options zfs zfs_vdev_sync_read_min_active=64
options zfs zfs_vdev_sync_read_max_active=128
options zfs zfs_vdev_async_read_min_active=64
options zfs zfs_vdev_async_read_max_active=128
options zfs zfs_top_maxinflight=320
options zfs zfs_txg_timeout=30
options zfs zfs_dirty_data_max_percent=40
options zfs zfs_vdev_scheduler=deadline
options zfs zfs_vdev_async_write_min_active=8
options zfs zfs_vdev_async_write_max_active=64
options zfs zfs_prefetch_disable=1
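
When experimenting with a file like this, it can help to see exactly which parameters it sets before rebooting. The following is a minimal sketch, not part of the original answer: `parse_modprobe_options` is a hypothetical helper that reads `options zfs key=value` lines into a dictionary, shown here on three of the lines above.

```python
def parse_modprobe_options(text, module="zfs"):
    """Return {parameter: value} for `options <module> key=value ...` lines."""
    params = {}
    for line in text.splitlines():
        # Drop trailing comments, then split on whitespace.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 3 or fields[0] != "options" or fields[1] != module:
            continue
        for assignment in fields[2:]:
            key, sep, value = assignment.partition("=")
            if sep:
                params[key] = value
    return params

SAMPLE = """\
options zfs zfs_txg_timeout=30
options zfs zfs_dirty_data_max_percent=40
options zfs zfs_prefetch_disable=1
"""

for name, value in parse_modprobe_options(SAMPLE).items():
    # On a live system the loaded value can usually be read back from
    # /sys/module/zfs/parameters/<name> (assumption: the zfs module is loaded).
    print(f"{name} = {value}")
```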

— ewwhite

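Great advice. I've updated my original question with more detail. Summary: ashift didn't help, and I think there's a CPU usage component to this problem.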

— anelson

Are you using compression or dedupe?

— ewwhite

No, I confirmed compression is off with zfs get compression. Dedupe is off as well.

— anelson

That's a fair point, but I can show that the underlying virtualized storage devices perform much better. See Update 2 in the post.

— anelson

@anelson Okay. Try the settings above.

— ewwhite

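It looks likely that you're waiting on a Linux kernel mutex lock that in turn may be waiting on a Xen ring buffer. I can't be certain of this without access to a similar machine, but I'm not interested in paying Amazon $7/hour for that privilege.

A longer write-up is here: https://www.reddit.com/r/zfs/comments/4b4r1y/why_is_zfs_on_linux_unable_to_fully_utilize_8x/d1e91wo ; I'd rather it be in one place than two.

— Matthew Barnson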

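I've spent a decent amount of time trying to track this down. My specific challenge: a Postgres server, where I want to use ZFS for its data volume. The baseline is XFS.

First and foremost, my trials tell me that ashift=12 is wrong. If there is some magic ashift number, it's not 12. I'm using 0 and getting very good results.

I've also experimented with a bunch of zfs options, and the ones that give me the results below are: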

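atime=off - I don't need access times

checksum=off - I'm striping, not mirroring

compression=lz4 - Performance is better with compression (CPU tradeoff?)

exec=off - This is for data, not executables

logbias=throughput - Read on the internet that this is better for PostgreSQL

recordsize=8k - PostgreSQL-specific 8k block size

sync=standard - Tried turning sync off; didn't see much benefit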

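My tests below show better-than-XFS performance (please comment if you see errors in my tests!).

With this, my next step is to try Postgres running on a 2 x EBS ZFS filesystem.

My specific setup:

EC2: m4.xlarge instance

EBS: 250GB gp2 volumes

Kernel: Linux [...] 3.13.0-105-generic #152-Ubuntu SMP [...] x86_64 x86_64 x86_64 GNU/Linux *

First, I wanted to test raw EBS performance. Using a variation of the fio command above, I came up with the incantation below. Note: I'm using 8k blocks because that's what I've read PostgreSQL writes are: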

ubuntu@ip-172-31-30-233:~$ device=/dev/xvdbd; sudo dd if=/dev/zero of=${device} bs=1M count=100 && sudo fio --name randwrite --ioengine=libaio --iodepth=4 --rw=randwrite --bs=8k --size=400G --numjobs=4 --runtime=60 --group_reporting --fallocate=none --filename=${device}
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 0.250631 s, 418 MB/s
randwrite: (g=0): rw=randwrite, bs=8K-8K/8K-8K/8K-8K, ioengine=libaio, iodepth=4
...
randwrite: (g=0): rw=randwrite, bs=8K-8K/8K-8K/8K-8K, ioengine=libaio, iodepth=4
fio-2.1.3
Starting 4 processes
Jobs: 4 (f=4): [wwww] [100.0% done] [0KB/13552KB/0KB /s] [0/1694/0 iops] [eta 00m:00s]
randwrite: (groupid=0, jobs=4): err= 0: pid=18109: Tue Feb 14 19:13:53 2017
  write: io=3192.2MB, bw=54184KB/s, iops=6773, runt= 60327msec
    slat (usec): min=2, max=805209, avg=585.73, stdev=6238.19
    clat (usec): min=4, max=805236, avg=1763.29, stdev=10716.41
     lat (usec): min=15, max=805241, avg=2349.30, stdev=12321.43
    clat percentiles (usec):
     |  1.00th=[   15],  5.00th=[   16], 10.00th=[   17], 20.00th=[   19],
     | 30.00th=[   23], 40.00th=[   24], 50.00th=[   25], 60.00th=[   26],
     | 70.00th=[   27], 80.00th=[   29], 90.00th=[   36], 95.00th=[15808],
     | 99.00th=[31872], 99.50th=[35584], 99.90th=[99840], 99.95th=[199680],
     | 99.99th=[399360]
    bw (KB  /s): min=  156, max=1025440, per=26.00%, avg=14088.05, stdev=67584.25
    lat (usec) : 10=0.01%, 20=20.53%, 50=72.20%, 100=0.86%, 250=0.17%
    lat (usec) : 500=0.13%, 750=0.01%, 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%, 10=0.59%, 20=2.01%, 50=3.29%
    lat (msec) : 100=0.11%, 250=0.05%, 500=0.02%, 750=0.01%, 1000=0.01%
  cpu          : usr=0.22%, sys=1.34%, ctx=9832, majf=0, minf=114
  IO depths    : 1=0.1%, 2=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=408595/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=3192.2MB, aggrb=54184KB/s, minb=54184KB/s, maxb=54184KB/s, mint=60327msec, maxt=60327msec

Disk stats (read/write):
  xvdbd: ios=170/187241, merge=0/190688, ticks=180/8586692, in_queue=8590296, util=99.51%

Raw performance for the EBS volume is WRITE: io=3192.2MB.

Now, testing XFS with the same fio command:

Jobs: 4 (f=4): [wwww] [100.0% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta 00m:00s]
randwrite: (groupid=0, jobs=4): err= 0: pid=17441: Tue Feb 14 19:10:27 2017
  write: io=3181.9MB, bw=54282KB/s, iops=6785, runt= 60024msec
    slat (usec): min=3, max=21077K, avg=587.19, stdev=76081.88
    clat (usec): min=4, max=21077K, avg=1768.72, stdev=131857.04
     lat (usec): min=23, max=21077K, avg=2356.23, stdev=152444.62
    clat percentiles (usec):
     |  1.00th=[   29],  5.00th=[   40], 10.00th=[   46], 20.00th=[   52],
     | 30.00th=[   56], 40.00th=[   59], 50.00th=[   63], 60.00th=[   69],
     | 70.00th=[   79], 80.00th=[   99], 90.00th=[  137], 95.00th=[  274],
     | 99.00th=[17024], 99.50th=[25472], 99.90th=[70144], 99.95th=[120320],
     | 99.99th=[1564672]
    bw (KB  /s): min=    2, max=239872, per=66.72%, avg=36217.04, stdev=51480.84
    lat (usec) : 10=0.01%, 20=0.03%, 50=15.58%, 100=64.51%, 250=14.55%
    lat (usec) : 500=1.36%, 750=0.33%, 1000=0.25%
    lat (msec) : 2=0.68%, 4=0.67%, 10=0.71%, 20=0.58%, 50=0.59%
    lat (msec) : 100=0.10%, 250=0.02%, 500=0.01%, 750=0.01%, 1000=0.01%
    lat (msec) : 2000=0.01%, >=2000=0.01%
  cpu          : usr=0.43%, sys=4.81%, ctx=269518, majf=0, minf=110
  IO depths    : 1=0.1%, 2=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=407278/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=3181.9MB, aggrb=54282KB/s, minb=54282KB/s, maxb=54282KB/s, mint=60024msec, maxt=60024msec

Disk stats (read/write):
  xvdbd: ios=4/50983, merge=0/319694, ticks=0/2067760, in_queue=2069888, util=26.21%

Our baseline is WRITE: io=3181.9MB; really close to raw disk speed.

Now, on to ZFS, with WRITE: io=3181.9MB as the reference:

ubuntu@ip-172-31-30-233:~$ sudo zpool create testpool xvdbd -f && (for option in atime=off checksum=off compression=lz4 exec=off logbias=throughput recordsize=8k sync=standard; do sudo zfs set $option testpool; done;) && sudo fio --name randwrite --ioengine=libaio --iodepth=4 --rw=randwrite --bs=8k --size=400G --numjobs=4 --runtime=60 --group_reporting --fallocate=none --filename=/testpool/testfile; sudo zpool destroy testpool
randwrite: (g=0): rw=randwrite, bs=8K-8K/8K-8K/8K-8K, ioengine=libaio, iodepth=4
...
randwrite: (g=0): rw=randwrite, bs=8K-8K/8K-8K/8K-8K, ioengine=libaio, iodepth=4
fio-2.1.3
Starting 4 processes
randwrite: Laying out IO file(s) (1 file(s) / 409600MB)
randwrite: Laying out IO file(s) (1 file(s) / 409600MB)
randwrite: Laying out IO file(s) (1 file(s) / 409600MB)
randwrite: Laying out IO file(s) (1 file(s) / 409600MB)
Jobs: 4 (f=4): [wwww] [100.0% done] [0KB/41328KB/0KB /s] [0/5166/0 iops] [eta 00m:00s]
randwrite: (groupid=0, jobs=4): err= 0: pid=18923: Tue Feb 14 19:17:18 2017
  write: io=4191.7MB, bw=71536KB/s, iops=8941, runt= 60001msec
    slat (usec): min=10, max=1399.9K, avg=442.26, stdev=4482.85
    clat (usec): min=2, max=1400.4K, avg=1343.38, stdev=7805.37
     lat (usec): min=56, max=1400.4K, avg=1786.61, stdev=9044.27
    clat percentiles (usec):
     |  1.00th=[   62],  5.00th=[   75], 10.00th=[   87], 20.00th=[  108],
     | 30.00th=[  122], 40.00th=[  167], 50.00th=[  620], 60.00th=[ 1176],
     | 70.00th=[ 1496], 80.00th=[ 2320], 90.00th=[ 2992], 95.00th=[ 4128],
     | 99.00th=[ 6816], 99.50th=[ 9536], 99.90th=[30592], 99.95th=[66048],
     | 99.99th=[185344]
    bw (KB  /s): min= 2332, max=82848, per=25.46%, avg=18211.64, stdev=15010.61
    lat (usec) : 4=0.01%, 50=0.09%, 100=14.60%, 250=26.77%, 500=5.96%
    lat (usec) : 750=5.27%, 1000=4.24%
    lat (msec) : 2=20.96%, 4=16.74%, 10=4.93%, 20=0.30%, 50=0.08%
    lat (msec) : 100=0.04%, 250=0.03%, 500=0.01%, 2000=0.01%
  cpu          : usr=0.61%, sys=9.48%, ctx=177901, majf=0, minf=107
  IO depths    : 1=0.1%, 2=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=536527/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=4191.7MB, aggrb=71535KB/s, minb=71535KB/s, maxb=71535KB/s, mint=60001msec, maxt=60001msec

Notice that this performed better than XFS: WRITE: io=4191.7MB. I'm pretty sure this is due to compression.

For kicks, I'm going to add a second volume:

ubuntu@ip-172-31-30-233:~$ sudo zpool create testpool xvdb{c,d} -f && (for option in atime=off checksum=off compression=lz4 exec=off logbias=throughput recordsize=8k sync=standard; do sudo zfs set $option testpool; done;) && sudo fio --name randwrite --ioengine=libaio --iodepth=4 --rw=randwrite --bs=8k --size=400G --numjobs=4 --runtime=60 --group_reporting --fallocate=none --filename=/testpool/testfile; sudo zpool destroy testpool
randwrite: (g=0): rw=randwrite, bs=8K-8K/8K-8K/8K-8K, ioengine=libaio, iodepth=4
...
randwrite: (g=0): rw=randwrite, bs=8K-8K/8K-8K/8K-8K, ioengine=libaio, iodepth=4
fio-2.1.3
Starting 4 processes
randwrite: Laying out IO file(s) (1 file(s) / 409600MB)
randwrite: Laying out IO file(s) (1 file(s) / 409600MB)
randwrite: Laying out IO file(s) (1 file(s) / 409600MB)
randwrite: Laying out IO file(s) (1 file(s) / 409600MB)
Jobs: 4 (f=4): [wwww] [100.0% done] [0KB/71936KB/0KB /s] [0/8992/0 iops] [eta 00m:00s]
randwrite: (groupid=0, jobs=4): err= 0: pid=20901: Tue Feb 14 19:23:30 2017
  write: io=5975.9MB, bw=101983KB/s, iops=12747, runt= 60003msec
    slat (usec): min=10, max=1831.2K, avg=308.61, stdev=4419.95
    clat (usec): min=3, max=1831.6K, avg=942.64, stdev=7696.18
     lat (usec): min=58, max=1831.8K, avg=1252.25, stdev=8896.67
    clat percentiles (usec):
     |  1.00th=[   70],  5.00th=[   92], 10.00th=[  106], 20.00th=[  129],
     | 30.00th=[  386], 40.00th=[  490], 50.00th=[  692], 60.00th=[  796],
     | 70.00th=[  932], 80.00th=[ 1160], 90.00th=[ 1624], 95.00th=[ 2256],
     | 99.00th=[ 5344], 99.50th=[ 8512], 99.90th=[30592], 99.95th=[60672],
     | 99.99th=[117248]
    bw (KB  /s): min=   52, max=112576, per=25.61%, avg=26116.98, stdev=15313.32
    lat (usec) : 4=0.01%, 10=0.01%, 50=0.04%, 100=7.17%, 250=19.04%
    lat (usec) : 500=14.36%, 750=15.36%, 1000=17.41%
    lat (msec) : 2=20.28%, 4=4.82%, 10=1.13%, 20=0.25%, 50=0.08%
    lat (msec) : 100=0.04%, 250=0.02%, 2000=0.01%
  cpu          : usr=1.05%, sys=15.14%, ctx=396649, majf=0, minf=103
  IO depths    : 1=0.1%, 2=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=764909/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=5975.9MB, aggrb=101982KB/s, minb=101982KB/s, maxb=101982KB/s, mint=60003msec, maxt=60003msec

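With a second volume, WRITE: io=5975.9MB, so ~1.8X the writes.

A third volume gives us WRITE: io=6667.5MB, so ~2.1X the writes.

And a fourth volume gives us WRITE: io=6552.9MB. For this instance type, it looks like I almost cap the EBS network with two volumes, definitely with three, and it's no better with four (750 * 3 = 2250 IOPS).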
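
The multipliers above appear to be taken relative to the XFS baseline of io=3181.9MB, and the 750 figure matches a baseline of 3 IOPS per GiB on a 250GB gp2 volume. A rough sketch of both calculations, under the assumption (not stated in the answer) that the gp2 baseline rate is 3 IOPS per GiB:

```python
XFS_BASELINE_MB = 3181.9
ZFS_RUNS_MB = {1: 4191.7, 2: 5975.9, 3: 6667.5, 4: 6552.9}  # volumes -> io=

GP2_IOPS_PER_GIB = 3  # assumption: gp2 baseline rate, not stated in the post

def baseline_iops(size_gib, volumes=1):
    """Combined baseline IOPS for a number of equally sized gp2 volumes."""
    return GP2_IOPS_PER_GIB * size_gib * volumes

for volumes, written_mb in ZFS_RUNS_MB.items():
    print(f"{volumes} volume(s): {written_mb / XFS_BASELINE_MB:.2f}x XFS, "
          f"gp2 baseline {baseline_iops(250, volumes)} IOPS")
```

Because every run lasted about 60 seconds, the ratio of data written is close to the ratio of throughput.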

* From this video: make sure you're using a 3.8+ Linux kernel to get all the EBS goodness.

— berto

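Interesting results. Note that I think you've misread WRITE: io=; that's not the speed, it's the amount of data written in that time. It relates to speed only for tests with a fixed runtime, and for consistency with other benchmarks it's better to focus on IOPS, iops=. In your case the results are similar. You could probably do a lot better with provisioned IOPS EBS volumes and a larger instance. See this page for the expected EBS caps by instance size. Just beware: EBS charges add up quickly if you're not careful!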

— anelson
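
The distinction in the comment above can be checked from the fio output itself: dividing io= by the runtime gives the bandwidth, and dividing that by the 8K block size gives the IOPS. A minimal sketch using the figures reported in the answer (the helper name is hypothetical; fio's MB and KB are assumed to be binary units here, which is consistent with the reported bw= values):

```python
def fio_rates(io_mb, runtime_ms, block_kb=8):
    """Return (bandwidth in KB/s, IOPS) from fio's io= and runt= fields."""
    bw_kb_s = io_mb * 1024 / (runtime_ms / 1000)
    return bw_kb_s, bw_kb_s / block_kb

RUNS = {
    "raw EBS": (3192.2, 60327),
    "XFS": (3181.9, 60024),
    "ZFS, 1 volume": (4191.7, 60001),
    "ZFS, 2 volumes": (5975.9, 60003),
}

for name, (io_mb, runtime_ms) in RUNS.items():
    bw, iops = fio_rates(io_mb, runtime_ms)
    print(f"{name}: {bw:.0f} KB/s, {iops:.0f} IOPS")
```

The results land within rounding of the iops= values fio printed (6773, 6785, 8941 and 12747), which is why the io= comparison happens to give a similar picture for these fixed 60-second runs.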

很好的反馈，谢谢@anelson！看了预配置的iops，它们非常昂贵。但是，我正在考虑为写入ZIL的日志卷创建一个小的预配置iops卷，并获得一些性能上的好处。我在某个地方读到的ZIL不会超过内存中的大小，并且我将其限制为2G /etc/modules.d/zfs.conf。下一个问题是给定ec2实例的iops的适当数量是多少。查看您所引用的页面仍然很棘手，而且我所看到的性能还没有达到文档说明的水平。

— berto