I'm trying to optimize a storage setup on some Sun hardware with Linux. Any ideas would be greatly appreciated.
We have the following hardware:
- Sun Blade X6270
- 2 LSISAS1068E SAS controllers
- 2 Sun J4400 JBODs with 1 TB disks (24 disks per JBOD)
- Fedora Core 12
- 2.6.33 release kernel from FC13 (also tried the latest 2.6.31 kernel from FC12, same results)
Here's the datasheet for the SAS hardware:
http://www.sun.com/storage/storage_networking/hba/sas/PCIe.pdf
It's using 8 lanes of PCI Express 1.0a. With a bandwidth of 250 MB/sec per lane, each SAS controller should be able to do 2000 MB/sec.
Each controller can do 3 Gb/sec per port and has two 4-port PHYs. We connect both PHYs from a controller to a JBOD, so between the JBOD and the controller we have 2 PHYs * 4 SAS ports * 3 Gb/sec = 24 Gb/sec of bandwidth, which is more than the PCI Express bandwidth.
With write caching enabled and big writes, each disk can sustain about 80 MB/sec (near the start of the disk). With 24 disks, that means we should be able to do 1920 MB/sec per JBOD.
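As a sanity check, the two ceilings above work out as follows (this just restates the arithmetic from the datasheet and the per-disk measurement):

```shell
# Back-of-the-envelope ceilings, restating the numbers above.
pcie_lanes=8
mb_per_lane=250                               # PCI Express 1.0a, per lane
pcie_limit=$((pcie_lanes * mb_per_lane))      # per-controller PCIe ceiling

disks_per_jbod=24
mb_per_disk=80                                # sustained write near disk start
disk_limit=$((disks_per_jbod * mb_per_disk))  # per-JBOD disk ceiling

echo "PCIe limit per controller: ${pcie_limit} MB/sec"
echo "Disk limit per JBOD:       ${disk_limit} MB/sec"
```

So the disks, not the PCIe slot, should be the binding limit per controller.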
The multipath stanza for each disk looks like this:

multipath {
        rr_min_io 100
        uid 0
        path_grouping_policy multibus
        failback manual
        path_selector "round-robin 0"
        rr_weight priorities
        alias somealias
        no_path_retry queue
        mode 0644
        gid 0
        wwid somewwid
}
I tried values of 50, 100, and 1000 for rr_min_io, but it doesn't seem to make much difference.
Along with varying rr_min_io I tried adding some delay between starting the dd's, to prevent all of them from writing over the same PHY at the same time, but this didn't make any difference either, so I think the I/Os are getting properly spread out.
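For reference, this is roughly how the rr_min_io sweep goes; the loop below only prints the commands rather than running them, and the sed pattern assumes every stanza in /etc/multipath.conf should get the same value:

```shell
# Sweep rr_min_io over the values tried; printed, not executed.
for v in 50 100 1000; do
    echo "sed -i 's/rr_min_io [0-9]*/rr_min_io ${v}/' /etc/multipath.conf"
    echo "multipath -r   # reload the maps, then rerun the dd benchmark"
done
```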
According to /proc/interrupts, the SAS controllers are using an "IR-IO-APIC-fasteoi" interrupt scheme. For some reason only core #0 in the machine is handling these interrupts. I can improve performance slightly by assigning a separate core to handle the interrupts for each SAS controller:
echo 2 > /proc/irq/24/smp_affinity
echo 4 > /proc/irq/26/smp_affinity
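Those two lines generalize to a small loop. The IRQ numbers (24, 26) are from my /proc/interrupts and will differ on other boxes; the loop only prints the commands it would run:

```shell
# Give each controller IRQ its own core; smp_affinity takes a hex CPU bitmask.
core=1
for irq in 24 26; do
    mask=$(printf '%x' $((1 << core)))        # core 1 -> mask 2, core 2 -> mask 4
    echo "echo ${mask} > /proc/irq/${irq}/smp_affinity"
    core=$((core + 1))
done
```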
Using dd to write to the disks generates "Function call interrupts" (no idea what these are), which are handled by core #4, so I keep other processes off that core too.
I run 48 dd's (one per disk), assigning each to a core not dealing with interrupts, like so:
taskset -c somecore dd if=/dev/zero of=/dev/mapper/mpathx oflag=direct bs=128M
oflag=direct prevents any kind of buffer cache from getting involved.
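For completeness, a sketch of how the 48 writers get launched, pinned round-robin onto the cores that are not handling interrupts (cores 0 and 4 reserved, per the above). The alias names are placeholders for the real multipath aliases, and the loop only prints the commands:

```shell
# One dd per multipath alias, round-robin over the non-interrupt cores.
cores="1 2 3 5 6 7 8 9 10 11 12 13 14 15"
set -- $cores
for dev in mpatha mpathb mpathc; do           # ...continue through all 48
    echo "taskset -c $1 dd if=/dev/zero of=/dev/mapper/${dev} oflag=direct bs=128M &"
    shift
    if [ $# -eq 0 ]; then set -- $cores; fi   # wrap when the core list runs out
done
```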
None of my cores seems maxed out. The cores dealing with interrupts are mostly idle, and all the other cores are waiting on I/O, as one would expect:
Cpu0 : 0.0%us, 1.0%sy, 0.0%ni, 91.2%id,  7.5%wa, 0.0%hi, 0.2%si, 0.0%st
Cpu1 : 0.0%us, 0.8%sy, 0.0%ni, 93.0%id,  0.2%wa, 0.0%hi, 6.0%si, 0.0%st
Cpu2 : 0.0%us, 0.6%sy, 0.0%ni, 94.4%id,  0.1%wa, 0.0%hi, 4.8%si, 0.0%st
Cpu3 : 0.0%us, 7.5%sy, 0.0%ni, 36.3%id, 56.1%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 0.0%us, 1.3%sy, 0.0%ni, 85.7%id,  4.9%wa, 0.0%hi, 8.1%si, 0.0%st
Cpu5 : 0.1%us, 5.5%sy, 0.0%ni, 36.2%id, 58.3%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 0.0%us, 5.0%sy, 0.0%ni, 36.3%id, 58.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 0.0%us, 5.1%sy, 0.0%ni, 36.3%id, 58.5%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu8 : 0.1%us, 8.3%sy, 0.0%ni, 27.2%id, 64.4%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu9 : 0.1%us, 7.9%sy, 0.0%ni, 36.2%id, 55.8%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu10: 0.0%us, 7.8%sy, 0.0%ni, 36.2%id, 56.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu11: 0.0%us, 7.3%sy, 0.0%ni, 36.3%id, 56.4%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu12: 0.0%us, 5.6%sy, 0.0%ni, 33.1%id, 61.2%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu13: 0.1%us, 5.3%sy, 0.0%ni, 36.1%id, 58.5%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu14: 0.0%us, 4.9%sy, 0.0%ni, 36.4%id, 58.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu15: 0.1%us, 5.4%sy, 0.0%ni, 36.5%id, 58.1%wa, 0.0%hi, 0.0%si, 0.0%st
Given all of the above, the throughput reported by running "dstat 10" is in the 2200-2300 MB/sec range.
Given the math above I would expect something in the range of 2 * 1920 ~= 3600+ MB/sec.
Does anybody have any idea where my missing bandwidth went?
Thanks!