1. HDFS design principles
Load balancing, with distributed computation in mind
--> the block mechanism
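2. The default block size is 128 MB [Hadoop 2.0 and later]
Reason: assume a transfer rate of about 100 MB/s and a seek time of about 10 ms, and aim for the seek time to be about 1% of the transfer time. The transfer time is then 10 ms / 1% = 1 s, which at 100 MB/s gives about 100 MB per block; the nearest power of two is 128 MB.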
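The arithmetic behind this rule of thumb can be checked in a few lines. This is only a sketch: the 10 ms, 1%, and 100 MB/s figures are the ones from the note, and both function names are made up for illustration.

```python
def ideal_block_size_mb(seek_ms=10.0, seek_ratio=0.01, rate_mb_per_s=100.0):
    """Block size (MB) at which seek time is seek_ratio of the transfer time."""
    transfer_s = (seek_ms / 1000.0) / seek_ratio  # 10 ms / 1% = 1 s
    return transfer_s * rate_mb_per_s             # 1 s * 100 MB/s = 100 MB


def nearest_power_of_two(x):
    """Power of two closest to x (a tie goes to the larger one)."""
    lower = 1
    while lower * 2 <= x:
        lower *= 2
    upper = lower * 2
    return lower if (x - lower) < (upper - x) else upper


ideal = ideal_block_size_mb()
print(ideal, nearest_power_of_two(ideal))
```

With a faster disk the same reasoning suggests a larger block, for example `nearest_power_of_two(ideal_block_size_mb(rate_mb_per_s=200.0))` gives 256.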
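Block size configuration parameter (the value 10 shown here is not the default; the default is 134217728 bytes, i.e. 128 MB):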
<property>
<name>dfs.blocksize</name>
<value>10</value>
<description>
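The default block size for new files, in bytes.
You can use the following suffixes (case insensitive): k, m, g, t, p, e to specify the size (such as 128k, 512m, 1g, etc.),
or provide the complete size in bytes (such as 134217728 for 128 MB).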
</description>
</property>
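The suffix notation in the description can be illustrated with a small parser. This is a sketch, not Hadoop's own parsing code: `parse_size` is a hypothetical helper, and it assumes the suffixes are binary multiples (k = 1024, m = 1024^2, and so on), which is consistent with 128 MB being written as 134217728.

```python
# Binary multipliers: k = 1024, m = 1024**2, ..., e = 1024**6 (assumed).
_SUFFIXES = {s: 1024 ** (i + 1) for i, s in enumerate("kmgtpe")}


def parse_size(text):
    """Convert a size such as '128m' or '134217728' to a number of bytes."""
    text = text.strip().lower()  # suffixes are case insensitive
    if not text:
        raise ValueError("empty size string")
    if text[-1] in _SUFFIXES:
        return int(text[:-1]) * _SUFFIXES[text[-1]]
    return int(text)  # no suffix: a plain byte count


print(parse_size("128m"), parse_size("134217728"))
```

Both spellings of the 128 MB default produce the same byte count.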
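When changing the block size, take dfs.namenode.fs-limits.min-block-size into account (default 1 MB); the NameNode rejects files created with a block size below it: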
<property>
<name>dfs.namenode.fs-limits.min-block-size</name>
<value>10</value>
<description>
Minimum block size in bytes, enforced by the Namenode at create
time. This prevents the accidental creation of files with tiny block
sizes (and thus many blocks), which can degrade performance.
</description>
</property>
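The effect of this limit can be sketched as a simple check. The function below is a hypothetical stand-in for the NameNode's validation, not its actual code; it only shows why a 10-byte block size is rejected unless the minimum is lowered as well.

```python
MIN_BLOCK_SIZE = 1024 * 1024  # 1 MB, the default mentioned in the note


def check_block_size(block_size, min_block_size=MIN_BLOCK_SIZE):
    """Return block_size if it satisfies the minimum, otherwise raise."""
    if block_size < min_block_size:
        raise ValueError(
            "block size %d is below the minimum %d" % (block_size, min_block_size)
        )
    return block_size


# Accepted with the default minimum.
check_block_size(128 * 1024 * 1024)
# A 10-byte block is accepted only once the minimum is lowered to 10.
check_block_size(10, min_block_size=10)
```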
***************************************************************
<property>
<name>dfs.bytes-per-checksum</name>
<value>10</value>
<description>
The number of bytes per checksum. Must not be larger than dfs.stream-buffer-size
</description>
</property>
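To get a feel for what this setting means, the sketch below counts how many checksummed chunks one block contains. It rests on assumptions that are not in the note: a stock value of 512 bytes per checksum and a 4-byte checksum per chunk (as with CRC32). The function name is made up.

```python
def checksum_overhead(block_size, bytes_per_checksum=512, checksum_len=4):
    """Return (number of chunks, total checksum bytes) for one block.

    checksum_len=4 assumes a CRC32-style checksum; that is an assumption.
    """
    chunks = -(-block_size // bytes_per_checksum)  # ceiling division
    return chunks, chunks * checksum_len


# A 128 MB block with 512 bytes per checksum.
print(checksum_overhead(128 * 1024 * 1024))
# The 10-byte block and 10 bytes per checksum used in the config above.
print(checksum_overhead(10, bytes_per_checksum=10))
```

Under these assumptions a full 128 MB block carries about 1 MB of checksum data.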
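Summary: a block is physical; it is actually stored on the local disk under {hadoop.tmp.dir}/dfs/data
Blocks are defined per file: each file is split into blocks, and each block is a physical file stored under its block number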
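How a file maps onto blocks can be sketched as follows. This only illustrates the splitting arithmetic; `split_into_blocks` is a hypothetical helper, and the 300 MB file is an invented example.

```python
MB = 1024 * 1024


def split_into_blocks(file_size, block_size=128 * MB):
    """Return a list of (offset, length) pairs, one per block of the file."""
    blocks = []
    offset = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)  # last block may be short
        blocks.append((offset, length))
        offset += length
    return blocks


for offset, length in split_into_blocks(300 * MB):
    print(offset // MB, length // MB)
```

A 300 MB file occupies three blocks of 128 MB, 128 MB, and 44 MB; the last block takes only as much disk space as it holds.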
Analysis of the web UI at port 50070
---------------------------------
1. Startup Progress page [the cluster startup process]
Elapsed Time: 1 sec, Percent Complete: 100%
Phase    Completion    Elapsed Time
[loading the fsimage image file] Loading fsimage /home/hyxy/tmp/hadoop/dfs/name/current/fsimage_0000000000000000010 351 B    100%    0 sec
    inodes (0/0)    100%
    delegation tokens (0/0)    100%
    cache pools (0/0)    100%
[loading the edits log files] Loading edits    100%    0 sec
    /home/hyxy/tmp/hadoop/dfs/name/current/edits_0000000000000000011-0000000000000000011 1 MB (1/1)    100%
    /home/hyxy/tmp/hadoop/dfs/name/current/edits_0000000000000000012-0000000000000000012 1 MB (1/1)    100%
    /home/hyxy/tmp/hadoop/dfs/name/current/edits_0000000000000000013-0000000000000000015 1 MB (3/3)    100%
Saving checkpoint    100%    0 sec
Safe mode    100%    0 sec
    awaiting reported blocks (0/0)
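The file names in this listing carry transaction IDs: the fsimage name ends with the last transaction it contains, and each edits name gives the first and last transaction of that segment. The sketch below reads those numbers out of the paths shown above; the helper names are made up, and it only parses names without reading the files.

```python
import re

FSIMAGE_RE = re.compile(r"fsimage_(\d+)$")
EDITS_RE = re.compile(r"edits_(\d+)-(\d+)$")


def fsimage_last_txid(path):
    """Last transaction ID covered by an fsimage file, taken from its name."""
    match = FSIMAGE_RE.search(path)
    if match is None:
        raise ValueError("not an fsimage file name: " + path)
    return int(match.group(1))


def edits_txid_range(path):
    """Return (first txid, last txid, transaction count) of an edits segment."""
    match = EDITS_RE.search(path)
    if match is None:
        raise ValueError("not a finalized edits file name: " + path)
    first, last = int(match.group(1)), int(match.group(2))
    return first, last, last - first + 1


base = "/home/hyxy/tmp/hadoop/dfs/name/current/"
print(fsimage_last_txid(base + "fsimage_0000000000000000010"))
print(edits_txid_range(base + "edits_0000000000000000013-0000000000000000015"))
```

The counts agree with the page: the fsimage ends at transaction 10, the edits segments start at 11, and the 13-15 segment holds the 3 transactions reported as (3/3).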