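A search on Stack Overflow showed that this problem is a kernel bug, and that the fix is to upgrade the kernel.
The steps below upgrade the default 3.10 kernel on CentOS 7.0 to version 4.0.2.
1. Import the GPG key for the yum repository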
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
2. Install the yum repository
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
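3. Install the new kernel
The elrepo-kernel repository from ELRepo provides the mainline kernel (kernel-ml), which is version 4.0.2 at the time of writing.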
[root@ip-10-10-29-201 ~]# yum --enablerepo=elrepo-kernel install kernel-ml-devel kernel-ml
Loaded plugins: fastestmirror
MooseFS | 951 B 00:00:00
base | 3.6 kB 00:00:00
elrepo | 2.9 kB 00:00:00
elrepo-kernel | 2.9 kB 00:00:00
extras | 3.4 kB 00:00:00
updates | 3.4 kB 00:00:00
(1/2): elrepo/primary_db | 233 kB 00:00:02
(2/2): elrepo-kernel/primary_db | 782 kB 00:00:04
MooseFS/primary | 4.2 kB 00:00:00
Loading mirror speeds from cached hostfile
* base: mirrors.yun-idc.com
* elrepo: repos.lax-noc.com
* elrepo-kernel: repos.lax-noc.com
* extras: mirror.bit.edu.cn
* updates: mirror.bit.edu.cn
MooseFS 30/30
Resolving Dependencies
--> Running transaction check
---> Package kernel-ml.x86_64 0:4.0.2-1.el7.elrepo will be installed
---> Package kernel-ml-devel.x86_64 0:4.0.2-1.el7.elrepo will be installed
--> Finished Dependency Resolution
Dependencies Resolved
==========================================================================================================================================================================
Package Arch Version Repository Size
==========================================================================================================================================================================
Installing:
kernel-ml x86_64 4.0.2-1.el7.elrepo elrepo-kernel 36 M
kernel-ml-devel x86_64 4.0.2-1.el7.elrepo elrepo-kernel 9.5 M
Transaction Summary
==========================================================================================================================================================================
Install 2 Packages
Total download size: 45 M
Installed size: 199 M
Is this ok [y/d/N]: y
Downloading packages:
(1/2): kernel-ml-4.0.2-1.el7.elrepo.x86_64.rpm | 36 MB 00:00:11
(2/2): kernel-ml-devel-4.0.2-1.el7.elrepo.x86_64.rpm | 9.5 MB 00:00:31
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total 1.5 MB/s | 45 MB 00:00:31
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Warning: RPMDB altered outside of yum.
Installing : kernel-ml-devel-4.0.2-1.el7.elrepo.x86_64 1/2
Installing : kernel-ml-4.0.2-1.el7.elrepo.x86_64 2/2
Verifying : kernel-ml-4.0.2-1.el7.elrepo.x86_64 1/2
Verifying : kernel-ml-devel-4.0.2-1.el7.elrepo.x86_64 2/2
Installed:
kernel-ml.x86_64 0:4.0.2-1.el7.elrepo kernel-ml-devel.x86_64 0:4.0.2-1.el7.elrepo
Complete!
4. Check the current kernel version
[root@ip-10-10-29-201 ~]# uname -r
3.10.0-123.el7.x86_64
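Important: the system is still running the default kernel. If you reboot right after this step, it will still boot the default 3.10 kernel rather than the new 4.0.2 kernel. To change which entry boots by default, continue with the following.
List the boot menu entries in order: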
[root@ip-10-10-29-201 ~]# awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg
CentOS Linux (4.0.2-1.el7.elrepo.x86_64) 7 (Core)
CentOS Linux, with Linux 3.10.0-123.el7.x86_64
CentOS Linux, with Linux 0-rescue-18b184aa09434ecf9739a70c6b63638a
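Boot entries are numbered from 0, and the new kernel is inserted at the top of the list: 4.0.2 is now entry 0 and the old 3.10 kernel has moved to entry 1. To boot the new kernel by default, select entry 0: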
[root@ip-10-10-29-201 ~]# grub2-set-default 0
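Hard-coding index 0 works here because the new kernel happens to be the first menu entry. As a minimal sketch, assuming the same menuentry layout that the awk command above relies on, the index can instead be looked up by kernel version. The helper name grub_entry_index is hypothetical and is not part of grub2.

```shell
# Print the zero-based index of the first menuentry whose title contains
# the given kernel version string. Returns non-zero if nothing matches.
# grub_entry_index is a hypothetical helper, not a grub2 command.
grub_entry_index() {
    cfg="$1"        # path to the grub config, e.g. /etc/grub2.cfg
    version="$2"    # substring to look for, e.g. 4.0.2-1.el7.elrepo
    awk -F\' -v ver="$version" '
        BEGIN { n = 0 }
        $1 == "menuentry " {
            # $2 is the entry title between the first pair of single quotes
            if (index($2, ver) > 0) { print n; found = 1; exit }
            n++
        }
        END { if (!found) exit 1 }
    ' "$cfg"
}
```

With this helper, grub2-set-default "$(grub_entry_index /etc/grub2.cfg 4.0.2-1.el7.elrepo)" should select the same entry as the command above. Like the original awk command, it only counts entries whose line starts with "menuentry " at column 0, so entries nested inside a submenu are ignored.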
5. Reboot
reboot
6. Check the kernel version after the reboot
[root@ip-10-10-29-201 conf]# uname -r
4.0.2-1.el7.elrepo.x86_64
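When the upgrade is rolled out to more than one machine, the same check can be scripted instead of read by eye. The following is a rough sketch, assuming GNU sort with the -V (version sort) option is available, as it is in coreutils on CentOS 7. The function name kernel_at_least is hypothetical.

```shell
# Succeed if the first version string is greater than or equal to the
# second, using GNU sort's version ordering (so 3.10 sorts after 3.9).
# kernel_at_least is a hypothetical helper name.
kernel_at_least() {
    current="$1"     # e.g. the output of uname -r
    required="$2"    # minimum acceptable version, e.g. 4.0.2
    lowest=$(printf '%s\n%s\n' "$current" "$required" | sort -V | head -n 1)
    # If the required version sorts first (or both are equal), current is new enough.
    [ "$lowest" = "$required" ]
}
```

For example, kernel_at_least "$(uname -r)" 4.0.2 || echo "still on the old kernel" could be run after the reboot to flag hosts that did not pick up the new kernel.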
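In the 20 days since the upgrade the problem has not recurred, so we conclude that it was caused by a kernel bug and was resolved by upgrading the kernel.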