MySQL InnoDB崩溃验尸


28

MySQL今天早上对我崩溃了。

除了标准的MySQL随附数据库外,我使用的都是InnoDB。

我尝试重新启动MySQL守护程序,但失败两次。

然后,我重新启动了整个服务器,MySQL正确启动,并且自此以来一直运行良好。

初始崩溃的mysqld日志文件包含以下内容:

120927 10:21:05 mysqld_safe Number of processes running now: 0
120927 10:21:06 mysqld_safe mysqld restarted
120927 10:21:12 [Note] Plugin 'FEDERATED' is disabled.
120927 10:21:12 InnoDB: The InnoDB memory heap is disabled
120927 10:21:12 InnoDB: Mutexes and rw_locks use GCC atomic builtins
120927 10:21:12 InnoDB: Compressed tables use zlib 1.2.3
120927 10:21:12 InnoDB: Using Linux native AIO
120927 10:21:13 InnoDB: Initializing buffer pool, size = 4.0G
InnoDB: mmap(4395630592 bytes) failed; errno 12
120927 10:21:13 InnoDB: Completed initialization of buffer pool
120927 10:21:13 InnoDB: Fatal error: cannot allocate memory for the buffer pool
120927 10:21:13 [ERROR] Plugin 'InnoDB' init function returned error.
120927 10:21:13 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
120927 10:21:13 [ERROR] Unknown/unsupported storage engine: InnoDB
120927 10:21:13 [ERROR] Aborting

120927 10:21:13 [Note] /usr/libexec/mysqld: Shutdown complete

120927 10:21:13 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended

尝试重新启动守护程序时,mysqld日志文件包含:

120927 10:43:44 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
120927 10:43:44 [Note] Plugin 'FEDERATED' is disabled.
120927 10:43:44 InnoDB: The InnoDB memory heap is disabled
120927 10:43:44 InnoDB: Mutexes and rw_locks use GCC atomic builtins
120927 10:43:44 InnoDB: Compressed tables use zlib 1.2.3
120927 10:43:44 InnoDB: Using Linux native AIO
120927 10:43:44 InnoDB: Initializing buffer pool, size = 4.0G
InnoDB: mmap(4395630592 bytes) failed; errno 12
120927 10:43:44 InnoDB: Completed initialization of buffer pool
120927 10:43:44 InnoDB: Fatal error: cannot allocate memory for the buffer pool
120927 10:43:44 [ERROR] Plugin 'InnoDB' init function returned error.
120927 10:43:44 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
120927 10:43:44 [ERROR] Unknown/unsupported storage engine: InnoDB
120927 10:43:44 [ERROR] Aborting

120927 10:43:44 [Note] /usr/libexec/mysqld: Shutdown complete

120927 10:43:44 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended

服务器重启后,mysqld日志文件包含:

120927 10:46:11 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
120927 10:46:11 [Note] Plugin 'FEDERATED' is disabled.
120927 10:46:11 InnoDB: The InnoDB memory heap is disabled
120927 10:46:11 InnoDB: Mutexes and rw_locks use GCC atomic builtins
120927 10:46:11 InnoDB: Compressed tables use zlib 1.2.3
120927 10:46:11 InnoDB: Using Linux native AIO
120927 10:46:11 InnoDB: Initializing buffer pool, size = 4.0G
120927 10:46:11 InnoDB: Completed initialization of buffer pool
120927 10:46:12 InnoDB: highest supported file format is Barracuda.
InnoDB: The log sequence number in ibdata files does not match
InnoDB: the log sequence number in the ib_logfiles!
120927 10:46:12  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
120927 10:46:15  InnoDB: Waiting for the background threads to start
120927 10:46:16 InnoDB: 1.1.8 started; log sequence number 57665645675
120927 10:46:16 [Note] Event Scheduler: Loaded 0 events
120927 10:46:16 [Note] /usr/libexec/mysqld: ready for connections.
Version: '5.5.21-cll'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL) by Atomicorp

我从来没有尝试解密崩溃的MySQL日志文件。

我正在使用版本:5.5.21-cll Atomicorp的MySQL Community Server(GPL)

关于我应该从哪里开始的任何想法?

更新:根据@ Michael-sqlbot的建议,我拉出了syslog并发现了这一点:

Sep 27 10:20:58 ip-97-74-197-181 kernel: pcscd invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0
Sep 27 10:21:00 ip-97-74-197-181 kernel:
Sep 27 10:21:00 ip-97-74-197-181 kernel: Call Trace:
Sep 27 10:21:00 ip-97-74-197-181 kernel:  [<ffffffff800c9f35>] out_of_memory+0x8e/0x2f3
Sep 27 10:21:00 ip-97-74-197-181 kernel:  [<ffffffff8002dfc7>] __wake_up+0x38/0x4f
Sep 27 10:21:00 ip-97-74-197-181 kernel:  [<ffffffff8000f67d>] __alloc_pages+0x27f/0x308
Sep 27 10:21:00 ip-97-74-197-181 kernel:  [<ffffffff80017a84>] cache_grow+0x139/0x3c7
Sep 27 10:21:00 ip-97-74-197-181 kernel:  [<ffffffff8005be28>] cache_alloc_refill+0x138/0x188
Sep 27 10:21:00 ip-97-74-197-181 kernel:  [<ffffffff8000ad2e>] kmem_cache_alloc+0x6c/0x76
Sep 27 10:21:00 ip-97-74-197-181 kernel:  [<ffffffff80012877>] getname+0x25/0x1c2
Sep 27 10:21:00 ip-97-74-197-181 kernel:  [<ffffffff8001a04b>] do_sys_open+0x17/0xbe
Sep 27 10:21:00 ip-97-74-197-181 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 27 10:21:00 ip-97-74-197-181 kernel:
Sep 27 10:21:11 ip-97-74-197-181 kernel: Mem-info:
Sep 27 10:21:20 ip-97-74-197-181 kernel: Node 0 DMA per-cpu:
Sep 27 10:21:27 ip-97-74-197-181 kernel: cpu 0 hot: high 0, batch 1 used:0
Sep 27 10:21:38 ip-97-74-197-181 kernel: cpu 0 cold: high 0, batch 1 used:0
Sep 27 10:21:49 ip-97-74-197-181 kernel: cpu 1 hot: high 0, batch 1 used:0
Sep 27 10:21:49 ip-97-74-197-181 kernel: cpu 1 cold: high 0, batch 1 used:0
Sep 27 10:21:49 ip-97-74-197-181 kernel: cpu 2 hot: high 0, batch 1 used:0
Sep 27 10:21:52 ip-97-74-197-181 kernel: cpu 2 cold: high 0, batch 1 used:0
Sep 27 10:21:52 ip-97-74-197-181 kernel: cpu 3 hot: high 0, batch 1 used:0
Sep 27 10:21:52 ip-97-74-197-181 kernel: cpu 3 cold: high 0, batch 1 used:0
Sep 27 10:21:52 ip-97-74-197-181 kernel: Node 0 DMA32 per-cpu:
Sep 27 10:21:52 ip-97-74-197-181 kernel: cpu 0 hot: high 186, batch 31 used:60
Sep 27 10:21:52 ip-97-74-197-181 kernel: cpu 0 cold: high 62, batch 15 used:57
Sep 27 10:21:52 ip-97-74-197-181 kernel: cpu 1 hot: high 186, batch 31 used:139
Sep 27 10:21:52 ip-97-74-197-181 kernel: cpu 1 cold: high 62, batch 15 used:61
Sep 27 10:21:52 ip-97-74-197-181 kernel: cpu 2 hot: high 186, batch 31 used:47
Sep 27 10:21:52 ip-97-74-197-181 kernel: cpu 2 cold: high 62, batch 15 used:57
Sep 27 10:21:52 ip-97-74-197-181 kernel: cpu 3 hot: high 186, batch 31 used:52
Sep 27 10:21:52 ip-97-74-197-181 kernel: cpu 3 cold: high 62, batch 15 used:53
Sep 27 10:21:52 ip-97-74-197-181 kernel: Node 0 Normal per-cpu:
Sep 27 10:21:52 ip-97-74-197-181 kernel: cpu 0 hot: high 186, batch 31 used:29
Sep 27 10:21:52 ip-97-74-197-181 kernel: cpu 0 cold: high 62, batch 15 used:17
Sep 27 10:21:52 ip-97-74-197-181 kernel: cpu 1 hot: high 186, batch 31 used:178
Sep 27 10:21:52 ip-97-74-197-181 kernel: cpu 1 cold: high 62, batch 15 used:52
Sep 27 10:21:52 ip-97-74-197-181 kernel: cpu 2 hot: high 186, batch 31 used:22
Sep 27 10:21:52 ip-97-74-197-181 kernel: cpu 2 cold: high 62, batch 15 used:59
Sep 27 10:21:52 ip-97-74-197-181 kernel: cpu 3 hot: high 186, batch 31 used:71
Sep 27 10:21:52 ip-97-74-197-181 kernel: cpu 3 cold: high 62, batch 15 used:54
Sep 27 10:21:52 ip-97-74-197-181 kernel: Node 0 HighMem per-cpu: empty
Sep 27 10:21:52 ip-97-74-197-181 kernel: Free pages:       41728kB (0kB HighMem)
Sep 27 10:21:52 ip-97-74-197-181 kernel: Active:1031140 inactive:970428 dirty:0 writeback:0 unstable:0 free:10432 slab:4277 mapped-file:801 mapped-anon:1993003 pagetables:11636
Sep 27 10:21:52 ip-97-74-197-181 kernel: Node 0 DMA free:10096kB min:12kB low:12kB high:16kB active:0kB inactive:0kB present:9700kB pages_scanned:0 all_unreclaimable? yes
Sep 27 10:21:52 ip-97-74-197-181 kernel: lowmem_reserve[]: 0 2965 8015 8015
Sep 27 10:21:52 ip-97-74-197-181 kernel: Node 0 DMA32 free:24424kB min:4236kB low:5292kB high:6352kB active:1544164kB inactive:1428756kB present:3037024kB pages_scanned:7185900 all_unreclaimable? yes
Sep 27 10:21:52 ip-97-74-197-181 kernel: lowmem_reserve[]: 0 0 5050 5050
Sep 27 10:21:52 ip-97-74-197-181 kernel: Node 0 Normal free:7208kB min:7212kB low:9012kB high:10816kB active:2580172kB inactive:2453052kB present:5171200kB pages_scanned:12935183 all_unreclaimable? yes
Sep 27 10:21:52 ip-97-74-197-181 kernel: lowmem_reserve[]: 0 0 0 0
Sep 27 10:21:52 ip-97-74-197-181 kernel: Node 0 HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Sep 27 10:21:52 ip-97-74-197-181 kernel: lowmem_reserve[]: 0 0 0 0
Sep 27 10:21:52 ip-97-74-197-181 kernel: Node 0 DMA: 6*4kB 3*8kB 4*16kB 4*32kB 4*64kB 5*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 2*4096kB = 10096kB
Sep 27 10:21:52 ip-97-74-197-181 kernel: Node 0 DMA32: 24*4kB 3*8kB 1*16kB 1*32kB 1*64kB 3*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 5*4096kB = 24424kB
Sep 27 10:21:52 ip-97-74-197-181 kernel: Node 0 Normal: 0*4kB 13*8kB 8*16kB 0*32kB 19*64kB 1*128kB 2*256kB 0*512kB 1*1024kB 0*2048kB 1*4096kB = 7208kB
Sep 27 10:21:52 ip-97-74-197-181 kernel: Node 0 HighMem: empty
Sep 27 10:21:52 ip-97-74-197-181 kernel: 9391 pagecache pages
Sep 27 10:21:52 ip-97-74-197-181 kernel: Swap cache: add 5745145, delete 5744809, find 81873079/82270945, race 0+63
Sep 27 10:21:52 ip-97-74-197-181 kernel: Free swap  = 0kB
Sep 27 10:21:52 ip-97-74-197-181 kernel: Total swap = 2096472kB
Sep 27 10:21:52 ip-97-74-197-181 kernel: Free swap:            0kB
Sep 27 10:21:52 ip-97-74-197-181 kernel: 2359296 pages of RAM
Sep 27 10:21:52 ip-97-74-197-181 kernel: 324458 reserved pages
Sep 27 10:21:52 ip-97-74-197-181 kernel: 21388 pages shared
Sep 27 10:21:52 ip-97-74-197-181 kernel: 336 pages swap cached
Sep 27 10:21:52 ip-97-74-197-181 kernel: Out of memory: Killed process 3044, UID 27, (mysqld).

Answers:


34

我有好消息,也有坏消息。好消息是,您的文件系统和mysql最有可能...但是,请检查/ var / log / syslog或等效文件,以查看10:21:05之前系统上还发生了什么。

当您发布的第一条消息被记录时,您的mysql服务器已经死亡。

120927 10:21:05 mysqld_safe Number of processes running now: 0

因此,假设您没有忽略mysql错误日志中的任何内容,那么我要说的是它没有崩溃并死掉-它实际上已被杀死。

当mysqld_safe(这是一个包装程序,而不是服务器本身)意识到服务器未运行并且服务器没有正常终止时,它将为您重新启动它。

120927 10:21:06 mysqld_safe mysqld restarted

...然后服务器守护程序记录了一些正常的启动消息...

120927 10:21:12 [Note] Plugin 'FEDERATED' is disabled.
120927 10:21:12 InnoDB: The InnoDB memory heap is disabled
120927 10:21:12 InnoDB: Mutexes and rw_locks use GCC atomic builtins
120927 10:21:12 InnoDB: Compressed tables use zlib 1.2.3
120927 10:21:12 InnoDB: Using Linux native AIO

...但是当mysqld要求操作系统为InnoDB缓冲池分配4GB内存时...

120927 10:21:13 InnoDB: Initializing buffer pool, size = 4.0G

...内核说“不”。

InnoDB: mmap(4395630592 bytes) failed; errno 12

检查内核源以确保:

#define ENOMEM      12  /* Out of memory */

是的 因此,“ fail; errno 12”行以下的每条消息都应该被忽略-它们都是该消息的副作用。

但同样,所有这些事情都是第一次坠机之后发生的。

我最好的猜测是,内存极低的情况导致您的内核最初杀死mysqld来稳定系统。

自然,重新启动后导致内存不足的任何原因都消失了。mysql服务器能够为InnoDB缓冲池分配4GB的空间,并且一切都会好起来,直到导致内存不足的原因再次导致它。

首先猜测:apache子进程运行不正常。


3

这件事最近发生在我身上,该线程对于帮助我了解正在发生的事情非常宝贵。

我正在具有1GB Ram的Ubuntu 14.04上运行LAMP堆栈。由于网络流量激增,我的服务器一直崩溃。对我来说,摆弄mysql配置文件只会延长我经历另一个随机崩溃的时间。为了测试并解决此问题,我最终使用了Apache的ab工具:

ab -n 100 -c 10 http://gastonia.com/

10个并发(-c)可以,因此增加直到我达到30-bam-崩溃。我退后一步,直到找到一个安全的数字,然后调整了Apache的ServerLimit指令:

ServerLimit 20

之后,我可以将-c更改为我想要的任何数字,而我还没有遇到另一次崩溃。

希望对任何遇到相同问题的人都有帮助。


非常感谢!我当时正在用MySQL拔头发。当我看到此过程时,对于我的情况来说似乎是可行的。我尝试过,它完全按照您在测试中描述的那样工作!
zkarj

1

我遇到了与最初发布的问题相同的问题,并且在神秘的Mysql_safe重新启动后以随机的间隔重复出现了InnoDB损坏。

我通常能够通过首先停止Apache来重新启动Mysql。

看完这篇文章后,我查看了系统日志,发现:内核:内存不足:杀死进程19468(mysqld)得分256或以与神秘的mysql崩溃相同的时间戳牺牲孩子。

我还将其与流量激增和httpd(apache)进程堆栈匹配。我减小了innoDB池的大小,稍微增加了交换大小,并限制了Apache进程的最大数量,以防万一。


0

当你有

Sep 27 10:21:52 ip-97-74-197-181 kernel: Swap cache: add 5745145, delete 5744809, find 81873079/82270945, race 0+63
Sep 27 10:21:52 ip-97-74-197-181 kernel: Free swap  = 0kB
Sep 27 10:21:52 ip-97-74-197-181 kernel: Total swap = 2096472kB
Sep 27 10:21:52 ip-97-74-197-181 kernel: Free swap:            0kB

尝试检测哪些进程占用了您的内存并进行了交换:

for file in /proc/*/status ; do awk '/VmSwap|Name/{printf $2 " " $3}END{ print ""}' $file; done | sort -k 2 -n -r | less

然后杀死所有:

ps -ef | grep eatingprocess |  grep -v grep  |  awk '{print $2}' | xargs kill -9
By using our site, you acknowledge that you have read and understand our Cookie Policy and Privacy Policy.
Licensed under cc by-sa 3.0 with attribution required.