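A problem that comes up often in Linux development is a memory leak that eventually brings the device down.
Start with the kernel log: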
[134812.769862] Normal free:1000kB min:1000kB low:1248kB high:1500kB active_anon:20644kB inactive_anon:22328kB active_file:68kB inactive_file:160kB unevictable:5804kB isolated(anon):0kB isolated(file):0kB present:69632kB managed:62500kB mlocked:0kB dirty:0kB writeback:0kB mapped:1684kB shmem:22352kB slab_reclaimable:1028kB slab_unreclaimable:3776kB kernel_stack:904kB pagetables:572kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2157 all_unreclaimable? yes
[134812.865901] lowmem_reserve[]: 0 0
[134812.869418] Normal: 0*4kB 1*8kB (U) 0*16kB 1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 1*512kB (U) 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB 0*32768kB 0*65536kB = 1000kB
[134812.903645] 7096 total pagecache pages
[134812.907603] 0 pages in swap cache
[134812.921683] Swap cache stats: add 0, delete 0, find 0/0
[134812.927157] Free swap = 0kB
[134812.951705] Total swap = 0kB
[134812.955627] 17408 pages RAM
[134812.958804] 1466 pages reserved
[134812.969337] 265532 pages shared
[134812.973385] 14685 pages non-shared
[134812.981683] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
[134812.989877] [ 73] 0 73 457 111 5 0 0 getty
[134813.017899] [ 87] 0 87 6904 298 7 0 0 ipc_upgrade
[134813.041678] [ 158] 0 158 658 185 4 0 0 sys_cmd_server
[134813.057902] [ 164] 0 164 170639 5245 105 0 0 ipc_app
[134813.081678] [ 166] 0 166 457 120 4 0 0 sh
[134813.097900] [ 171] 0 171 240 63 3 0 0 wdt
[134813.121678] [ 293] 0 293 455 81 4 0 0 telnetd
[134813.137896] [ 303] 0 303 457 130 4 0 0 sh
[134813.151711] [26147] 0 26147 453 73 3 0 0 sleep
[134813.159967] Out of memory: Kill process 164 (ipc_app) score 307 or sacrifice child
[134813.191722] Killed process 164 (ipc_app) total-vm:682556kB, anon-rss:19584kB, file-rss:1396kB
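The line "Normal free:1000kB min:1000kB" shows that only 1000kB of memory is free in the Normal zone, which means free memory has dropped to the lowest level the system allows (the min watermark). There is no swap (Total swap = 0kB) and the zone is reported as all_unreclaimable, so the kernel's OOM killer picks the process it considers the best candidate and kills it. Here the victim is ipc_app, and from the user's point of view the system has crashed.
Problems like this can be approached from several angles: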
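The comparison between free and min can also be checked mechanically when going through many logs. The following is a minimal sketch, not part of the original write-up: the function names zone_field_kb and zone_at_min_watermark are hypothetical helpers, and the sketch assumes the zone line has the "key:valuekB" layout shown in the log above.

```c
#include <stdlib.h>
#include <string.h>

/* Return the kB value that follows `key` (for example "free:") in a zone
 * line of the OOM report, or -1 if the key is not present.
 * The key must start the line or follow a space, so that "free:" is not
 * matched inside a longer field name. */
long zone_field_kb(const char *line, const char *key)
{
    size_t klen = strlen(key);
    const char *p = line;

    while ((p = strstr(p, key)) != NULL) {
        if (p == line || p[-1] == ' ')
            return strtol(p + klen, NULL, 10);
        p += klen;
    }
    return -1;
}

/* Return 1 if free memory is at or below the min watermark, 0 if it is
 * above it, and -1 if either field is missing from the line. */
int zone_at_min_watermark(const char *line)
{
    long free_kb = zone_field_kb(line, "free:");
    long min_kb = zone_field_kb(line, "min:");

    if (free_kb < 0 || min_kb < 0)
        return -1;
    return free_kb <= min_kb;
}
```

Passing the zone line from the log above (with or without the timestamp prefix) to zone_at_min_watermark returns 1, because free and min are both 1000kB.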
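1. Run the free command repeatedly and watch how the free value changes over time.
2. Run top -d 1 and watch used, free, and cached.
3. While top is running, press <shift> + <s> and watch how the RSS of the suspect program changes over time.
4. Run netstat -a and check the Recv-Q and Send-Q columns for data that is not being consumed in time.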
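5. Look at the kernel's debug information on physical memory management (for example /proc/buddyinfo). It corresponds one to one with the log line [134812.869418] Normal: 0*4kB 1*8kB (U) 0*16kB 1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 1*512kB (U) 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB 0*32768kB 0*65536kB = 1000kB, and can be used to watch how the number of free contiguous blocks of each size changes over time, in order to determine whether the problem is related to memory fragmentation. (Search for the "buddy system" allocator for background.)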
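6. You can also use malloc_trim as a test. If the process's RSS drops after the call, that memory was free memory held by the allocator rather than a true leak.
malloc_trim(0); // release free memory from the heap back to the system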
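The malloc_trim test in item 6 can be tried in isolation before adding the call to the real application. The following is a minimal sketch that assumes glibc (malloc_trim is a glibc extension declared in <malloc.h>; other C libraries may not provide it or may make it a no-op). The function name trim_after_burst is hypothetical.

```c
#include <malloc.h>
#include <stdlib.h>
#include <string.h>

/* Allocate `count` blocks of `size` bytes, touch them so the pages are
 * really backed by RAM, free them all, then call malloc_trim(0).
 * Returns the result of malloc_trim (1 if memory was returned to the
 * system, 0 if nothing could be returned), or -1 if allocation failed. */
int trim_after_burst(size_t count, size_t size)
{
    void **blocks = calloc(count, sizeof(*blocks));
    int ok = 1;

    if (blocks == NULL)
        return -1;

    for (size_t i = 0; i < count; i++) {
        blocks[i] = malloc(size);
        if (blocks[i] == NULL) {
            ok = 0;                     /* remaining slots stay NULL (calloc) */
            break;
        }
        memset(blocks[i], 0xA5, size);  /* touch the pages so they count in RSS */
    }

    for (size_t i = 0; i < count; i++)
        free(blocks[i]);                /* free(NULL) is a no-op */
    free(blocks);

    if (!ok)
        return -1;
    return malloc_trim(0);              /* ask glibc to give free heap memory back */
}
```

A call such as trim_after_burst(1000, 1024) can be made while watching the process in top. The return value can be either 0 or 1, since free() may already have shrunk the top of the heap on its own before malloc_trim runs.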
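For item 5, the per-size block counts in the "Normal:" log line can be added up to confirm the total and to find the largest free contiguous block, which is the figure that matters for fragmentation. This is a minimal sketch assuming the "count*sizekB" layout shown in the log; the function name buddy_total_kb is hypothetical.

```c
#include <stdio.h>

/* Sum every "count*sizekB" term found in `line` and return the total in kB.
 * If `largest_kb` is not NULL, it receives the size of the largest block
 * whose count is non-zero (0 if there is none). Text that does not match
 * the pattern, such as "(U)" or the trailing "= 1000kB", is skipped. */
long buddy_total_kb(const char *line, long *largest_kb)
{
    long total = 0;
    long largest = 0;
    const char *p = line;

    while (*p != '\0') {
        long count, size;
        int used = 0;

        /* %n is only stored when the whole pattern, including "kB", matched */
        if (sscanf(p, "%ld*%ldkB%n", &count, &size, &used) == 2 && used > 0) {
            total += count * size;
            if (count > 0 && size > largest)
                largest = size;
            p += used;
        } else {
            p++;
        }
    }

    if (largest_kb != NULL)
        *largest_kb = largest;
    return total;
}
```

For the log line above, the terms add up to 8 + 32 + 64 + 128 + 256 + 512 = 1000kB, matching the reported total, and the largest free block is 512kB. Any allocation that needs a larger physically contiguous block would fail even before free memory reaches zero.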