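I have been getting btrfs and ext4 errors. When I decided to test my RAM, `memtester` reported the repeatable errors shown below. I always get similar errors after running `memtester`, usually within an hour, although once it took 4-5 hours.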
My computer's RAM is soldered on. I have an additional, empty slot. There is no BIOS setting to disable the onboard RAM.
I ran:
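- Memtest86+ for 8 passes (about 8 hours)
- MemTest86 for 18 passes (about 9 hours)
- `memtester` and `stressapptest` on a default Fedora 27 installed on a USB stick (about 10 hours)
- `memtester` and `stressapptest` on the default Ubuntu 17.10 live image (about 2 hours)
- `memtester` and `stressapptest` on Ubuntu 17.10 installed on a USB stick (about 8 hours)
- `# debsums --changed`: the only changed files were theme images.
None of these runs printed any errors.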
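I use Ubuntu 17.10 with the default kernel (upgraded from 17.04). The kernel is not tainted. This is an ASUS laptop with an Intel Haswell i3.
- Also tested with Linux 4.14.13 and 4.15.0-rc3, rc4, mainline.
- Also tested with the Intel microcode package purged.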
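The errors are reproducible whether Nouveau is disabled or enabled, and with no nvidia binary driver loaded.
I blacklisted the following modules: `mtd`, `intel_spi_platform`, `intel_spi`, because they are not loaded on a default Fedora 27 install and seem to have bricked some Lenovo laptops. The errors did not stop.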
Output of `uname -a`:
Linux hostname 4.13.0-19-generic #22-Ubuntu SMP Mon Dec 4 11:58:07 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Output of `# lsmod`:
https://paste.ubuntu.com/26222245/
Output of `# lsmod` on Fedora 27:
https://paste.ubuntu.com/26226473/
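Current situation
I put my hard drive into a laptop I believed to be good (my backup laptop) and tested there. I got the errors. Now I am fairly sure this is a software problem. With a fresh Ubuntu or Fedora, I can no longer trigger the errors on my laptop, even after many hours of trying.
What should I do?
Example errors: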
Loop 6:
Stuck Address : ok
Random Value : ok
Compare XOR : ok
Compare SUB : ok
Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Solid Bits : ok
Block Sequential : ok
Checkerboard : ok
Bit Spread : ok
Bit Flip : testing 262
FAILURE: 0x00000000 != 0xfffffffeffffffff at offset 0x0ef94000.
FAILURE: 0x00000000 != 0x100000000 at offset 0x0ef94008.
FAILURE: 0x00000000 != 0xfffffffeffffffff at offset 0x0ef94010.
FAILURE: 0x00000000 != 0x100000000 at offset 0x0ef94018.
FAILURE: 0x00000000 != 0xfffffffeffffffff at offset 0x0ef94020.
FAILURE: 0x00000000 != 0x100000000 at offset 0x0ef94028.
FAILURE: 0x00000000 != 0xfffffffeffffffff at offset 0x0ef94030.
FAILURE: 0x00000000 != 0x100000000 at offset 0x0ef94038.
Walking Ones : ok
Walking Zeroes : ok
8-bit Writes : ok
16-bit Writes : ok
Similar errors with both RAM slots populated:
Loop 1:
Stuck Address : ok
Random Value : ok
Compare XOR : ok
Compare SUB : ok
Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Solid Bits : ok
Block Sequential : ok
Checkerboard : ok
Bit Spread : testing 4
FAILURE: 0x00000000 != 0x00000050 at offset 0x7da80000.
FAILURE: 0x00000000 != 0xffffffffffffffaf at offset 0x7da80008.
FAILURE: 0x00000000 != 0x00000050 at offset 0x7da80010.
FAILURE: 0x00000000 != 0xffffffffffffffaf at offset 0x7da80018.
FAILURE: 0x00000000 != 0x00000050 at offset 0x7da80020.
FAILURE: 0x00000000 != 0xffffffffffffffaf at offset 0x7da80028.
FAILURE: 0x00000000 != 0x00000050 at offset 0x7da80030.
FAILURE: 0x00000000 != 0xffffffffffffffaf at offset 0x7da80038.
Bit Flip : setting 141
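Each burst in the two `memtester` logs above covers eight consecutive 8-byte words, that is 64 bytes starting at a 64-byte aligned offset. The sketch below buckets the reported offsets into 64-byte blocks to make that visible. The 64-byte block size is an assumption (it is the usual x86 cache line size, but the logs do not prove that the corruption is cache-line related), and `failures_per_block` and `BLOCK_SIZE` are hypothetical names used only for this illustration.

```python
import re
from collections import Counter

# Assumption: 64 bytes, the usual cache line size on x86 CPUs.
BLOCK_SIZE = 64

# Matches memtester lines such as:
# FAILURE: 0x00000000 != 0x00000050 at offset 0x7da80000.
FAILURE_RE = re.compile(r"FAILURE: \S+ != \S+ at offset (0x[0-9a-fA-F]+)\.")


def failures_per_block(log_text):
    """Count FAILURE lines per 64-byte aligned block of offsets."""
    counts = Counter()
    for match in FAILURE_RE.finditer(log_text):
        offset = int(match.group(1), 16)
        block_start = offset - offset % BLOCK_SIZE
        counts[block_start] += 1
    return dict(counts)


# The eight FAILURE lines from the first log above.
SAMPLE_LOG = """\
Bit Flip : testing 262
FAILURE: 0x00000000 != 0xfffffffeffffffff at offset 0x0ef94000.
FAILURE: 0x00000000 != 0x100000000 at offset 0x0ef94008.
FAILURE: 0x00000000 != 0xfffffffeffffffff at offset 0x0ef94010.
FAILURE: 0x00000000 != 0x100000000 at offset 0x0ef94018.
FAILURE: 0x00000000 != 0xfffffffeffffffff at offset 0x0ef94020.
FAILURE: 0x00000000 != 0x100000000 at offset 0x0ef94028.
FAILURE: 0x00000000 != 0xfffffffeffffffff at offset 0x0ef94030.
FAILURE: 0x00000000 != 0x100000000 at offset 0x0ef94038.
Walking Ones : ok
"""

for start, count in sorted(failures_per_block(SAMPLE_LOG).items()):
    print(f"block at {start:#010x}: {count} failing 8-byte words")
```

For the first log this reports a single block at `0x0ef94000` with 8 failing words, so the whole burst fits in one aligned 64-byte block.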
Errors from `stressapptest`:
Report Error: miscompare : DIMM Unknown : 1 : 157s
Hardware Error: miscompare on CPU 2(0x2) at 0x7fcc0726e000(0xb0d18:DIMM Unknown): read:0x0000000000000000, reread:0x0000000000000000 expected:0x4a4a4a4a4a4a4a4a
Report Error: miscompare : DIMM Unknown : 1 : 157s
Hardware Error: miscompare on CPU 2(0x2) at 0x7fcc0726e008(0xb0d18:DIMM Unknown): read:0x0000000000000000, reread:0x0000000000000000 expected:0x4a4a4a4a4a4a4a4a
Report Error: miscompare : DIMM Unknown : 1 : 157s
Hardware Error: miscompare on CPU 2(0x2) at 0x7fcc0726e010(0xb0d18:DIMM Unknown): read:0x0000000000000000, reread:0x0000000000000000 expected:0x4a4a4a4a4a4a4a4a
Report Error: miscompare : DIMM Unknown : 1 : 157s
Hardware Error: miscompare on CPU 2(0x2) at 0x7fcc0726e018(0xb0d18:DIMM Unknown): read:0x0000000000000000, reread:0x0000000000000000 expected:0x4a4a4a4a4a4a4a4a
Report Error: miscompare : DIMM Unknown : 1 : 157s
Hardware Error: miscompare on CPU 2(0x2) at 0x7fcc0726e020(0xb0d18:DIMM Unknown): read:0x0000000000000000, reread:0x0000000000000000 expected:0x4a4a4a4a4a4a4a4a
Report Error: miscompare : DIMM Unknown : 1 : 157s
Hardware Error: miscompare on CPU 2(0x2) at 0x7fcc0726e028(0xb0d18:DIMM Unknown): read:0x0000000000000000, reread:0x0000000000000000 expected:0x4a4a4a4a4a4a4a4a
Report Error: miscompare : DIMM Unknown : 1 : 157s
Hardware Error: miscompare on CPU 2(0x2) at 0x7fcc0726e030(0xb0d18:DIMM Unknown): read:0x0000000000000000, reread:0x0000000000000000 expected:0x4a4a4a4a4a4a4a4a
Report Error: miscompare : DIMM Unknown : 1 : 157s
Hardware Error: miscompare on CPU 2(0x2) at 0x7fcc0726e038(0xb0d18:DIMM Unknown): read:0x0000000000000000, reread:0x0000000000000000 expected:0x4a4a4a4a4a4a4a4a
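To compare the `stressapptest` records more easily, the "Hardware Error" lines can be parsed into fields. This is only a sketch: the regular expression is written against the lines quoted above rather than against any documented log format, and `parse_miscompares` is a hypothetical helper name. The `persistent` flag simply records whether `read` and `reread` returned the same value.

```python
import re

# Written against the log lines quoted above; not a documented format.
HW_ERROR_RE = re.compile(
    r"miscompare on CPU (\d+)\(0x[0-9a-fA-F]+\) at (0x[0-9a-fA-F]+)\(.*?\): "
    r"read:(0x[0-9a-fA-F]+), reread:(0x[0-9a-fA-F]+) "
    r"expected:(0x[0-9a-fA-F]+)"
)


def parse_miscompares(log_text):
    """Return one dict per 'Hardware Error: miscompare' line in log_text."""
    records = []
    for m in HW_ERROR_RE.finditer(log_text):
        address, read, reread, expected = (
            int(m.group(i), 16) for i in range(2, 6)
        )
        records.append({
            "cpu": int(m.group(1)),
            "address": address,
            "read": read,
            "reread": reread,
            "expected": expected,
            # True when the second read returned the same wrong value.
            "persistent": read == reread,
        })
    return records


# The first two records from the log above.
SAMPLE_LOG = """\
Report Error: miscompare : DIMM Unknown : 1 : 157s
Hardware Error: miscompare on CPU 2(0x2) at 0x7fcc0726e000(0xb0d18:DIMM Unknown): read:0x0000000000000000, reread:0x0000000000000000 expected:0x4a4a4a4a4a4a4a4a
Report Error: miscompare : DIMM Unknown : 1 : 157s
Hardware Error: miscompare on CPU 2(0x2) at 0x7fcc0726e008(0xb0d18:DIMM Unknown): read:0x0000000000000000, reread:0x0000000000000000 expected:0x4a4a4a4a4a4a4a4a
"""

for rec in parse_miscompares(SAMPLE_LOG):
    print(
        f"{rec['address']:#x}: read={rec['read']:#018x} "
        f"expected={rec['expected']:#018x} persistent={rec['persistent']}"
    )
```

In the records quoted above, `read` and `reread` are both zero, so the wrong value was still there on the second read.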
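I suspect that Ubuntu's configuration combined with my laptop's hardware is the culprit behind these errors. The failures almost always come in groups of eight.
Related but mostly irrelevant information below
About the btrfs errors: I was using 17.04 at the time. I asked on the btrfs IRC channel and was told it could be a hardware error or a memory management bug. Just like what I am seeing now, part of a btrfs metadata page was filled with zeros. I ran like that for a few months, then switched to ext4 and blamed the nvidia binary driver.
Commands I used and their parameters: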
# stressapptest -M 10000 -s 1800
10000 is the amount of free memory (in MB) that I can test; I got it from `free -m`. `-s` is the run time in seconds.
# memtester 4096
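The laptop's CPU has 2 cores, so I usually launch two instances. 4096 is half of the currently free memory, as reported by `free -m`.
`memtest86+` from any Ubuntu installation LiveCD.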
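The sizing described above (take the free memory from `free -m` and split it across one `memtester` instance per core) can be sketched as follows. This is an illustration only: `free_mb` and `memtester_commands` are hypothetical helper names, the sample `free -m` output uses made-up numbers, and reading the "free" column rather than "available" is an assumption about which figure was meant.

```python
# Illustrative `free -m` output; the numbers are made up for this sketch.
SAMPLE_FREE_OUTPUT = """\
              total        used        free      shared  buff/cache   available
Mem:          11876        1520        8192         210        2164        9800
Swap:          2047           0        2047
"""


def free_mb(free_output):
    """Return the 'free' column (in MB) of the 'Mem:' row of `free -m`."""
    for line in free_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "Mem:":
            # Columns after the label: total, used, free, ...
            return int(fields[3])
    raise ValueError("no 'Mem:' row in free output")


def memtester_commands(free_output, instances=2):
    """Build one `memtester <MB>` command per instance, splitting free memory."""
    per_instance = free_mb(free_output) // instances
    return [["memtester", str(per_instance)] for _ in range(instances)]


# Only print the commands here; nothing is executed.
for command in memtester_commands(SAMPLE_FREE_OUTPUT):
    print(" ".join(command))
```

To run this for real, the text could come from `subprocess.run(["free", "-m"], capture_output=True, text=True).stdout`, and each command could be started with `subprocess.Popen`, typically as root so that `memtester` can lock the memory it tests.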