RAID性能突然变慢


16

我们最近注意到,数据库查询的运行时间比平时更长。经过一些调查,看来我们的磁盘读取速度非常慢。

过去,RAID控制器在BBU上启动重新学习周期并切换到直写,也导致了类似的问题。这次似乎不是这样。

我在bonnie++几天中跑了几次。结果如下:

bonnie ++的输出

22-82 M / s的读取速度似乎很糟糕。dd在原始设备上运行几分钟,显示的读取速度为15.8 MB / s至225 MB / s(请参阅下面的更新)。iotop并不表示有任何其他进程在争夺IO,因此我不确定读取速度为何如此变化。

该RAID卡是MegaRAID SAS 9280,在RAID10中具有12个SAS驱动器(15k,300GB),带有XFS文件系统(在RAID1中配置的两个SSD上的OS)。我没有看到任何SMART警报,并且阵列似乎没有降级。

我也已经跑步了,xfs_check并且似乎没有任何XFS一致性问题。

接下来的调查步骤应该是什么?

服务器规格

Ubuntu 12.04.5 LTS
128GB RAM
Intel(R) Xeon(R) CPU E5-2643 0 @ 3.30GHz

输出xfs_repair -n

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 1
        - agno = 3
        - agno = 2
        - agno = 0
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

输出megacli -AdpAllInfo -aAll

                    Versions
                ================
Product Name    : LSI MegaRAID SAS 9280-4i4e
Serial No       : SV24919344
FW Package Build: 12.12.0-0124

                    Mfg. Data
                ================
Mfg. Date       : 12/06/12
Rework Date     : 00/00/00
Revision No     : 04B
Battery FRU     : N/A

                Image Versions in Flash:
                ================
FW Version         : 2.130.363-1846
BIOS Version       : 3.25.00_4.12.05.00_0x05180000
Preboot CLI Version: 04.04-020:#%00009
WebBIOS Version    : 6.0-51-e_47-Rel
NVDATA Version     : 2.09.03-0039
Boot Block Version : 2.02.00.00-0000
BOOT Version       : 09.250.01.219

                Pending Images in Flash
                ================
None

                PCI Info
                ================
Controller Id   : 0000
Vendor Id       : 1000
Device Id       : 0079
SubVendorId     : 1000
SubDeviceId     : 9282

Host Interface  : PCIE

ChipRevision    : B4

Link Speed       : 0
Number of Frontend Port: 0
Device Interface  : PCIE

Number of Backend Port: 8
Port  :  Address
0        5003048001c1e47f
1        0000000000000000
2        0000000000000000
3        0000000000000000
4        0000000000000000
5        0000000000000000
6        0000000000000000
7        0000000000000000

                HW Configuration
                ================
SAS Address      : 500605b005a6cbc0
BBU              : Present
Alarm            : Present
NVRAM            : Present
Serial Debugger  : Present
Memory           : Present
Flash            : Present
Memory Size      : 512MB
TPM              : Absent
On board Expander: Absent
Upgrade Key      : Absent
Temperature sensor for ROC    : Absent
Temperature sensor for controller    : Absent


                Settings
                ================
Current Time                     : 14:58:51 7/11, 2016
Predictive Fail Poll Interval    : 300sec
Interrupt Throttle Active Count  : 16
Interrupt Throttle Completion    : 50us
Rebuild Rate                     : 30%
PR Rate                          : 30%
BGI Rate                         : 30%
Check Consistency Rate           : 30%
Reconstruction Rate              : 30%
Cache Flush Interval             : 4s
Max Drives to Spinup at One Time : 4
Delay Among Spinup Groups        : 2s
Physical Drive Coercion Mode     : Disabled
Cluster Mode                     : Disabled
Alarm                            : Enabled
Auto Rebuild                     : Enabled
Battery Warning                  : Enabled
Ecc Bucket Size                  : 15
Ecc Bucket Leak Rate             : 1440 Minutes
Restore HotSpare on Insertion    : Disabled
Expose Enclosure Devices         : Enabled
Maintain PD Fail History         : Enabled
Host Request Reordering          : Enabled
Auto Detect BackPlane Enabled    : SGPIO/i2c SEP
Load Balance Mode                : Auto
Use FDE Only                     : No
Security Key Assigned            : No
Security Key Failed              : No
Security Key Not Backedup        : No
Default LD PowerSave Policy      : Controller Defined
Maximum number of direct attached drives to spin up in 1 min : 120
Auto Enhanced Import             : No
Any Offline VD Cache Preserved   : No
Allow Boot with Preserved Cache  : No
Disable Online Controller Reset  : No
PFK in NVRAM                     : No
Use disk activity for locate     : No
POST delay           : 90 seconds
BIOS Error Handling              : Stop On Errors
Current Boot Mode         :Normal
                Capabilities
                ================
RAID Level Supported             : RAID0, RAID1, RAID5, RAID6, RAID00, RAID10, RAID50, RAID60, PRL 11, PRL 11 with spanning, SRL 3 supported, PRL11-RLQ0 DDF layout with no span, PRL11-RLQ0 DDF layout with span
Supported Drives                 : SAS, SATA

Allowed Mixing:

Mix in Enclosure Allowed
Mix of SAS/SATA of HDD type in VD Allowed

                Status
                ================
ECC Bucket Count                 : 0

                Limitations
                ================
Max Arms Per VD          : 32
Max Spans Per VD         : 8
Max Arrays               : 128
Max Number of VDs        : 64
Max Parallel Commands    : 1008
Max SGE Count            : 80
Max Data Transfer Size   : 8192 sectors
Max Strips PerIO         : 42
Max LD per array         : 16
Min Strip Size           : 8 KB
Max Strip Size           : 1.0 MB
Max Configurable CacheCade Size: 0 GB
Current Size of CacheCade      : 0 GB
Current Size of FW Cache       : 350 MB

                Device Present
                ================
Virtual Drives    : 2
  Degraded        : 0
  Offline         : 0
Physical Devices  : 16
  Disks           : 14
  Critical Disks  : 0
  Failed Disks    : 0

                Supported Adapter Operations
                ================
Rebuild Rate                    : Yes
CC Rate                         : Yes
BGI Rate                        : Yes
Reconstruct Rate                : Yes
Patrol Read Rate                : Yes
Alarm Control                   : Yes
Cluster Support                 : No
BBU                             : Yes
Spanning                        : Yes
Dedicated Hot Spare             : Yes
Revertible Hot Spares           : Yes
Foreign Config Import           : Yes
Self Diagnostic                 : Yes
Allow Mixed Redundancy on Array : No
Global Hot Spares               : Yes
Deny SCSI Passthrough           : No
Deny SMP Passthrough            : No
Deny STP Passthrough            : No
Support Security                : No
Snapshot Enabled                : No
Support the OCE without adding drives : Yes
Support PFK                     : Yes
Support PI                      : No
Support Boot Time PFK Change    : No
Disable Online PFK Change       : No
PFK TrailTime Remaining         : 0 days 0 hours
Support Shield State            : No
Block SSD Write Disk Cache Change: No

                Supported VD Operations
                ================
Read Policy          : Yes
Write Policy         : Yes
IO Policy            : Yes
Access Policy        : Yes
Disk Cache Policy    : Yes
Reconstruction       : Yes
Deny Locate          : No
Deny CC              : No
Allow Ctrl Encryption: No
Enable LDBBM         : No
Support Breakmirror  : No
Power Savings        : No

                Supported PD Operations
                ================
Force Online                            : Yes
Force Offline                           : Yes
Force Rebuild                           : Yes
Deny Force Failed                       : No
Deny Force Good/Bad                     : No
Deny Missing Replace                    : No
Deny Clear                              : No
Deny Locate                             : No
Support Temperature                     : Yes
NCQ                                     : No
Disable Copyback                        : No
Enable JBOD                             : No
Enable Copyback on SMART                : No
Enable Copyback to SSD on SMART Error   : Yes
Enable SSD Patrol Read                  : No
PR Correct Unconfigured Areas           : Yes
Enable Spin Down of UnConfigured Drives : Yes
Disable Spin Down of hot spares         : No
Spin Down time                          : 30
T10 Power State                         : No
                Error Counters
                ================
Memory Correctable Errors   : 0
Memory Uncorrectable Errors : 0

                Cluster Information
                ================
Cluster Permitted     : No
Cluster Active        : No

                Default Settings
                ================
Phy Polarity                     : 0
Phy PolaritySplit                : 0
Background Rate                  : 30
Strip Size                       : 256kB
Flush Time                       : 4 seconds
Write Policy                     : WB
Read Policy                      : Adaptive
Cache When BBU Bad               : Disabled
Cached IO                        : No
SMART Mode                       : Mode 6
Alarm Disable                    : Yes
Coercion Mode                    : None
ZCR Config                       : Unknown
Dirty LED Shows Drive Activity   : No
BIOS Continue on Error           : 0
Spin Down Mode                   : None
Allowed Device Type              : SAS/SATA Mix
Allow Mix in Enclosure           : Yes
Allow HDD SAS/SATA Mix in VD     : Yes
Allow SSD SAS/SATA Mix in VD     : No
Allow HDD/SSD Mix in VD          : No
Allow SATA in Cluster            : No
Max Chained Enclosures           : 16
Disable Ctrl-R                   : Yes
Enable Web BIOS                  : Yes
Direct PD Mapping                : No
BIOS Enumerate VDs               : Yes
Restore Hot Spare on Insertion   : No
Expose Enclosure Devices         : Yes
Maintain PD Fail History         : Yes
Disable Puncturing               : No
Zero Based Enclosure Enumeration : No
PreBoot CLI Enabled              : Yes
LED Show Drive Activity          : Yes
Cluster Disable                  : Yes
SAS Disable                      : No
Auto Detect BackPlane Enable     : SGPIO/i2c SEP
Use FDE Only                     : No
Enable Led Header                : No
Delay during POST                : 0
EnableCrashDump                  : No
Disable Online Controller Reset  : No
EnableLDBBM                      : No
Un-Certified Hard Disk Drives    : Allow
Treat Single span R1E as R10     : No
Max LD per array                 : 16
Power Saving option              : Don't Auto spin down Configured Drives
Max power savings option is  not allowed for LDs. Only T10 power conditions are to be used.
Default spin down time in minutes: 30
Enable JBOD                      : No
TTY Log In Flash                 : No
Auto Enhanced Import             : No
BreakMirror RAID Support         : No
Disable Join Mirror              : No
Enable Shield State              : No
Time taken to detect CME         : 60s

输出megacli -AdpBbuCmd -GetBbuSTatus -aAll

BBU status for Adapter: 0

BatteryType: iBBU
Voltage: 4068 mV
Current: 0 mA
Temperature: 30 C
Battery State: Optimal
BBU Firmware Status:

  Charging Status              : Charging
  Voltage                                 : OK
  Temperature                             : OK
  Learn Cycle Requested                   : No
  Learn Cycle Active                      : No
  Learn Cycle Status                      : OK
  Learn Cycle Timeout                     : No
  I2c Errors Detected                     : No
  Battery Pack Missing                    : No
  Battery Replacement required            : No
  Remaining Capacity Low                  : No
  Periodic Learn Required                 : No
  Transparent Learn                       : No
  No space to cache offload               : No
  Pack is about to fail & should be replaced : No
  Cache Offload premium feature required  : No
  Module microcode update required        : No


GasGuageStatus:
  Fully Discharged        : No
  Fully Charged           : No
  Discharging             : Yes
  Initialized             : Yes
  Remaining Time Alarm    : No
  Discharge Terminated    : No
  Over Temperature        : No
  Charging Terminated     : No
  Over Charged            : No
  Relative State of Charge: 88 %
  Charger System State: 49169
  Charger System Ctrl: 0
  Charging current: 512 mA
  Absolute state of charge: 87 %
  Max Error: 4 %

Exit Code: 0x00

输出megacli -LDInfo -Lall -aAll

Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 111.281 GB
Sector Size         : 512
Mirror Data         : 111.281 GB
State               : Optimal
Strip Size          : 256 KB
Number Of Drives    : 2
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Disk's Default
Encryption Type     : None
Is VD Cached: No


Virtual Drive: 1 (Target Id: 1)
Name                :
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 1.633 TB
Sector Size         : 512
Mirror Data         : 1.633 TB
State               : Optimal
Strip Size          : 256 KB
Number Of Drives per span:2
Span Depth          : 6
Default Cache Policy: WriteBack, ReadAhead, Direct, Write Cache OK if Bad BBU
Current Cache Policy: WriteBack, ReadAhead, Direct, Write Cache OK if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Disk's Default
Encryption Type     : None
Is VD Cached: No

更新:根据安德鲁的建议,我跑dd了几分钟,看原始磁盘读取的吞吐量是多少:

dd if=/dev/sdb of=/dev/null bs=256k
19701+0 records in
19700+0 records out
5164236800 bytes (5.2 GB) copied, 202.553 s, 25.5 MB/s

具有大量可变吞吐量的其他运行结果:

18706857984 bytes (19 GB) copied, 1181.51 s, 15.8 MB/s
20923023360 bytes (21 GB) copied, 388.137 s, 53.9 MB/s
21205876736 bytes (21 GB) copied, 55.5997 s, 381 MB/s
25391005696 bytes (25 GB) copied, 153.903 s, 165 MB/s

更新2:输出为megacli -PDlist -aallhttps : //gist.github.com/danpelota/3fca1e5f90a1f358c2d52a49bfb08ef0


3
尝试直接从磁盘设备读取。喜欢的东西dd if=/dev/sdb of=/dev/null bs=256k,看看你得到什么带宽。只要确保除非你想从备份中恢复文件系统(S)......你不写入磁盘设备
安德鲁汉勒

2
@Dan,我在megacli -PDlist -aall输出中查看了详细信息,但那里没有发现任何明显错误的地方,smartctl -a -d sat+megaraid,10 /dev/sdb仅是一个示例,检查SMART计数器是否值得一看,恕我直言,SMART警报对我很少起作用,请首先使用来找到驱动器的正确模块设置smartctl --scan,然后更换部分sat+megaraid,21休假/dev/sda不变,不应该反正改变任何东西。我在上面使用过这种方式ServeRAID M5015 SAS/SATA Controller,尽管看起来还是一样。这是一个示例:pastebin.com/WYb8Utxr
Michal Sokolowski

9
@Dan,对我来说,这种行为仍然与“预失败”磁盘有关。某些磁盘可能几乎总是恢复您的读取,这会减慢整个过程。SMART统计数据对于批准/拒绝我的理论至关重要。再说一次,对我而言,大型raid计数器毫无用处,直到磁盘完全失效(它完全无法启动的状态)之后,它才改变了,就供应商和规模而言,我使用了类似的SASes。
Michal Sokolowski

2
@MichalSokolowski:很有道理。得到了正确的设备编号并smartctl在各个磁盘上运行。看起来磁盘18的非中等错误计数巨大:gist.github.com/danpelota/83b54854aa5af2e351ed71af5c8ebbf5
danpelota

1
很高兴您钉了它。请随时这样做。无论如何,我都会投票赞成。声誉提升只是副作用。请记得回答问题。您完成了大部分工作。:)我已经阅读了您的SMART统计信息,但以目前的形式,它们也没有用。我想查看实际的SMART计数器。很奇怪,您smartctl没有像我一样显示它。当然有实际答案。
Michal Sokolowski

Answers:


5

正如Michal 在其评论中指出的那样,问题出 “预故障”磁盘上。有从的MegaRAID控制器与smartctl读取的诊断没有红旗SMART Health Status:OK,但运行smartctl在每个磁盘上透露了一个巨大的非介质错误计数(我通过每个磁盘ID写了一个快速bash脚本循环)。这是完整输出中的相关位:

# Ran this for each individual disk on the /dev/sdb array:
smartctl -a -d megaraid,18  /dev/sdb

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:    7950078        0         0   7950078    7950078        660.801           0
write:         0        0         0         0          0        363.247           0
verify:       12        0         0        12         12          0.002           0

Non-medium error count:  3253718

除该驱动器(磁盘ID 18)外,其他所有驱动器均显示非中等错误计数0。我确定了磁盘,将其更换为新磁盘,然后又恢复了3gbps的读取速度。

根据smartmontools Wiki

显示的错误日志(如果有)显示在单独的行上:

  • 写错误计数器

  • 读取错误计数器

  • 验证错误计数器(仅在非零时显示)

  • 非中等错误计数器(仅显示一个数字)。这表示除写入,读取或验证错误之外的可恢复事件的数量。

  • 错误事件保存在“最后n个错误事件”日志页面中。保留的错误事件记录的数量(即“ n”)是特定于供应商的(例如,对于Hitachi 10K300型号磁盘,最多保留23条记录)。每个错误事件记录的内容均以ASCII和特定于供应商的形式表示。与每个错误事件记录关联的参数代码指示发生错误事件的相对时间。较高的参数代码表示错误事件在稍后发生。如果设备不支持此日志页面,则输出“不支持错误事件日志记录”。如果支持此日志页面并且有错误事件记录,则每个记录都以“ Error event:”作为前缀,其中是参数代码。


我再次点击它,所以如果您不介意的话,我会在您的答案中做笔记。
Michal Sokolowski

0

您需要验证驱动器的碎片:

xfs_db -r /dev/sdbx
frag

您将得到一个这样的答案:

actual 347954, ideal 15723, fragmentation factor 95.48%

如果碎片系数很高,则需要对磁盘进行碎片整理。(是的,我知道,就像在Windows上一样...):/

整理磁盘碎片: xfs_fsr -v /dev/sdbx


0

有了LSI,确实有很多事情很重要。

1)刷新RAID固件。您目前的情况还差一些。

2)刷新驱动器上的固件,并确保也将其更新。

3)更新您的驱动程序。根据LSI网站上的发行说明,他们刚刚在1月底发布了新的驱动程序。

然后,您可以重新运行测试以查看是否有任何更改。

By using our site, you acknowledge that you have read and understand our Cookie Policy and Privacy Policy.
Licensed under cc by-sa 3.0 with attribution required.