HDFS filesystem health checks and repairing data blocks
Checking the health of the HDFS filesystem
hdfs fsck / : starting from the root directory, checks every file's data blocks for corruption or missing replicas
[hadoop@ruozedata001 sbin]$ hdfs fsck /
Connecting to namenode via http://ruozedata002:50070/fsck?ugi=hadoop&path=%2F
FSCK started by hadoop (auth:SIMPLE) from /192.168.72.201 for path / at Wed Aug 21 00:29:22 CST 2019
Status: HEALTHY
Total size: 0 B
Total dirs: 7
Total files: 0
Total symlinks: 0
Total blocks (validated): 0
Minimally replicated blocks: 0
Over-replicated blocks: 0
Under-replicated blocks: 0
Mis-replicated blocks: 0
Default replication factor: 3
Average block replication: 0.0
Corrupt blocks: 0
Missing replicas: 0
Number of data-nodes: 3
Number of racks: 1
FSCK ended at Wed Aug 21 00:29:22 CST 2019 in 1 milliseconds
The filesystem under path '/' is HEALTHY
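When fsck reports CORRUPT instead of HEALTHY, the affected files can be located with additional standard `hdfs fsck` options. A minimal sketch, with the commands wrapped in shell functions so they can be dry-run; on a real cluster you would run the `hdfs fsck` commands directly:

```shell
# Locating damaged data with standard `hdfs fsck` options.
# Wrapped in functions only for convenience; the flags are real fsck options.

# List only the corrupt files and their block IDs
list_corrupt() {
  hdfs fsck / -list-corruptfileblocks
}

# Show per-file block details, replica locations and rack placement
block_detail() {
  hdfs fsck / -files -blocks -locations -racks
}
```

The paths printed by `-list-corruptfileblocks` are the inputs for the delete or manual-repair steps below.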
Deleting corrupt files: hdfs fsck / -delete (use with caution: the corrupt files are removed permanently)
Manually repairing corrupt files: hdfs debug
hdfs debug recoverLease -path <file path> -retries <number of retries>
hdfs debug recoverLease -path /xxx/yyy/aa.txt -retries 10
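The manual repair can be scripted as a loop over a list of corrupt paths (for example, the output of fsck's corrupt-file listing). A sketch; `repair_corrupt` is a hypothetical helper, not an official tool:

```shell
# Hypothetical helper: run `hdfs debug recoverLease` for every HDFS path
# listed in FILE (one path per line), retrying up to 10 times each.
repair_corrupt() {
  local list="$1"
  while IFS= read -r path; do
    [ -n "$path" ] || continue          # skip blank lines
    hdfs debug recoverLease -path "$path" -retries 10
  done < "$list"
}
```

Usage: `repair_corrupt corrupt_paths.txt`, run on a cluster node as a user with access to the files.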
Automatic repair
HDFS repairs corrupt data blocks automatically, but only on its own schedule. After a block becomes corrupt:
The corruption is not detected until the DN runs a directoryscan (the DataNode checks its in-memory block metadata against the blocks on disk).
The directoryscan runs every 6 hours by default:
dfs.datanode.directoryscan.interval: 21600 (seconds)
The block is not recovered until the DN sends its blockreport to the NN; the blockreport also runs every 6 hours by default:
dfs.blockreport.intervalMsec: 21600000 (milliseconds)
Recovery finally starts once the NN receives the blockreport.
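The two intervals above can be tuned in hdfs-site.xml; the values shown are the defaults (6 hours each), and shortening them trades faster detection for extra scan/report load on the DataNodes:

```xml
<!-- hdfs-site.xml: defaults for the two intervals discussed above -->
<property>
  <name>dfs.datanode.directoryscan.interval</name>
  <value>21600</value>    <!-- seconds: 6 hours -->
</property>
<property>
  <name>dfs.blockreport.intervalMsec</name>
  <value>21600000</value> <!-- milliseconds: 6 hours -->
</property>
```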
In production, manual repair is generally preferred for fixing corrupt data blocks rather than waiting on the automatic cycle.