Reference documentation: http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.6.0/
The hdfs command is the shell client Hadoop provides for working with the HDFS distributed file system. With it we can create, delete, and query files on the distributed file system, retrieve Hadoop configuration information, and start the HDFS-related service processes.
hdfs subcommands fall into two groups: user commands such as dfs and fsck, and administrator commands such as dfsadmin, namenode, and datanode.
1. Command: -ls / -lsr
Run: hdfs dfs -ls /
Difference: -lsr lists recursively (it is deprecated in Hadoop 2.x; use -ls -R instead).
[hadoop@hadoop-vm bin]$ ./hdfs dfs -ls /
17/07/16 16:54:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 4 items
drwxr-xr-x - hadoop supergroup 0 2017-05-07 16:22 /hbase
drwxr-xr-x - hadoop supergroup 0 2017-04-22 17:05 /logs
drwx------ - hadoop supergroup 0 2017-03-19 19:29 /tmp
drwxr-xr-x - hadoop supergroup 0 2017-03-19 19:42 /user
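A recursive listing of the same tree can be produced with -ls -R, the modern replacement for -lsr; a short sketch (the /user path is just an illustration):

```shell
# List everything under / recursively; -R replaces the deprecated -lsr
hdfs dfs -ls -R /

# Restrict the recursive listing to a single subtree
hdfs dfs -ls -R /user
```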
2. Command: -mkdir
Run: hdfs dfs -mkdir -p /edu/hdfs/mkdir
-p creates parent directories recursively and does not report an error when the target directory already exists; without -p, an existing target (or a missing parent) is an error. If the given path does not start with '/', the directory is created under the current user's home directory on HDFS; by default that home directory does not exist, so the command fails.
[hadoop@hadoop-vm bin]$ ./hdfs dfs -mkdir -p /edu/hdfs/mkdir
17/07/16 16:56:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
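The behaviors described above can be checked with a quick experiment (paths here are illustrative):

```shell
# Fails if /edu/hdfs/mkdir already exists, or if a parent is missing
hdfs dfs -mkdir /edu/hdfs/mkdir

# Succeeds either way: creates missing parents, tolerates existing dirs
hdfs dfs -mkdir -p /edu/hdfs/mkdir

# A relative path resolves against /user/<current user>; create the home
# directory first, otherwise the relative mkdir fails
hdfs dfs -mkdir -p /user/hadoop
hdfs dfs -mkdir mydir        # ends up as /user/hadoop/mydir
```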
3. Command: -put / -copyFromLocal / -moveFromLocal
Run: hdfs dfs -put /home/hadoop/bigdater/ /edu/put
The local path may be a directory or several files; what is allowed for the HDFS path depends on what is being uploaded:
1. If the local path is a directory: when the HDFS target does not exist, it is created and the directory contents are copied into it; when it does exist, the local directory itself is copied into it.
2. If the local path is a single file, the target file must not already exist on HDFS.
3. If the local path is several files, the target directory must already exist on HDFS.
[hadoop@hadoop-vm bin]$ ./hdfs dfs -put /home/hadoop/access.log /edu/put
17/07/16 17:01:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
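The three cases above can be sketched as follows (file names are illustrative):

```shell
# Case 1: local directory -> target created if missing, else copied into it
hdfs dfs -put /home/hadoop/bigdater/ /edu/put

# Case 2: single file -> the HDFS target file must not already exist
hdfs dfs -put /home/hadoop/access.log /edu/put/access.log

# Case 3: multiple files -> the HDFS target directory must already exist
hdfs dfs -mkdir -p /edu/put/logs
hdfs dfs -put a.log b.log /edu/put/logs
```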
Command: -get / -copyToLocal / -moveToLocal
Run: hdfs dfs -get /edu/put ./
get and put are mirror commands: put copies from the local file system to the cluster, get copies from the cluster to the local file system. The syntax is essentially the same.
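A sketch of the download direction, mirroring the upload examples (local paths are illustrative):

```shell
# Copy a file or directory from HDFS into the current local directory
hdfs dfs -get /edu/put ./

# -copyToLocal is equivalent to -get
hdfs dfs -copyToLocal /edu/put ./put-copy

# Concatenate all files under an HDFS directory into one local file
hdfs dfs -getmerge /edu/put ./merged.log
```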
4. Command: -cat / -text
Run: hdfs dfs -cat /edu/test.txt
Both cat and text print file contents, but their mechanisms differ: cat copies the raw bytes and prints them, while text has Hadoop decode the file into text before printing. cat is therefore only suitable for plain text files, whereas text can also display compressed files and SequenceFiles.
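The difference shows up with compressed files; a sketch, assuming a gzipped log at an illustrative path:

```shell
# Upload a gzip-compressed file (illustrative path)
hdfs dfs -put access.log.gz /edu/

# -cat prints the raw compressed bytes: unreadable on a terminal
hdfs dfs -cat /edu/access.log.gz

# -text detects the codec from the extension and prints the decoded text
hdfs dfs -text /edu/access.log.gz
```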
[hadoop@hadoop-vm bin]$ ./hdfs dfs -cat /edu/put
17/07/16 17:04:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
192.168.1.113^A1489911237.467^A192.168.1.230^A/BfImg.gif
192.168.1.113^A1489978305.820^A192.168.1.230^A/BfImg.gif?ver=1&u_mid=gerryliu123&c_time=1489891943955&en=e_cs&oid=orderid123&sdk=jdk&pl=java_server
192.168.1.113^A1489978305.854^A192.168.1.230^A/BfImg.gif?ver=1&u_mid=gerryliu456&c_time=1489891943971&en=e_cr&oid=orderid456&sdk=jdk&pl=java_server
192.168.1.113^A1489892586.268^A192.168.1.230^A/BfImg.gif?ver=1&u_mid=gerryliu123&c_time=1489892535468&en=e_cs&oid=orderid123&sdk=jdk&pl=java_server
192.168.1.113^A1489892586.270^A192.168.1.230^A/BfImg.gif?ver=1&u_mid=gerryliu456&c_time=1489892535470&en=e_cr&oid=orderid456&sdk=jdk&pl=java_server
192.168.1.113^A1489892638.003^A192.168.1.230^A/BfImg.gif?ver=1&u_mid=gerryliu123&c_time=1489892587206&en=e_cs&oid=orderid123&sdk=jdk&pl=java_server
192.168.1.113^A1489892638.006^A192.168.1.230^A/BfImg.gif?ver=1&u_mid=gerryliu456&c_time=1489892587208&en=e_cr&oid=orderid456&sdk=jdk&pl=java_server
192.168.1.113^A1489892777.976^A192.168.1.230^A/BfImg.gif?ver=1&u_mid=gerryliu123&c_time=1489892727190&en=e_cs&oid=orderid123&sdk=jdk&pl=java_server
192.168.1.113^A1489892777.979^A192.168.1.230^A/BfImg.gif?ver=1&u_mid=gerryliu456&c_time=1489892727193&en=e_cr&oid=orderid456&sdk=jdk&pl=java_server
192.168.1.113^A1489892825.590^A192.168.1.230^A/BfImg.gif?ver=1&u_mid=gerryliu123&c_time=1489892774806&en=e_cs&oid=orderid123&sdk=jdk&pl=java_server
192.168.1.113^A1489892825.592^A192.168.1.230^A/BfImg.gif?ver=1&u_mid=gerryliu456&c_time=1489892774808&en=e_cr&oid=orderid456&sdk=jdk&pl=java_server
5. Command: -rm / -rmdir
Run: hdfs dfs -rm -R /edu/put
The difference: rm can delete any file or directory (with -R for non-empty directories), while rmdir can only delete empty directories.
6. Command: fsck — checks HDFS files for missing or corrupt block replicas and reports whether each file is healthy. Run: hdfs fsck <path>
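A sketch of the distinction (paths are illustrative):

```shell
# rm deletes a file; -R (or -r) is required for non-empty directories
hdfs dfs -rm /edu/put/access.log
hdfs dfs -rm -R /edu/put

# rmdir only succeeds on an empty directory
hdfs dfs -mkdir /edu/empty
hdfs dfs -rmdir /edu/empty      # ok: directory is empty
hdfs dfs -rmdir /edu/hdfs       # fails if /edu/hdfs still has children
```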
[hadoop@hadoop-vm bin]$ ./hdfs fsck /edu/
17/07/16 17:07:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connecting to namenode via http://hadoop-vm:50070
FSCK started by hadoop (auth:SIMPLE) from /192.168.1.230 for path /edu/ at Sun Jul 16 17:07:44 CST 2017
.Status: HEALTHY
Total size: 1537 B
Total dirs: 3
Total files: 1
Total symlinks: 0
Total blocks (validated): 1 (avg. block size 1537 B)
Minimally replicated blocks: 1 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 1
Average block replication: 1.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 1
Number of racks: 1
FSCK ended at Sun Jul 16 17:07:44 CST 2017 in 4 milliseconds
The filesystem under path '/edu/' is HEALTHY
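fsck also accepts options that drill down to block level; a sketch using standard hdfs fsck flags:

```shell
# Show each file under the path, its blocks, and where replicas live
hdfs fsck /edu/ -files -blocks -locations

# List files whose blocks are corrupt or missing
hdfs fsck / -list-corruptfileblocks

# Move corrupt files to /lost+found, or delete them outright
hdfs fsck / -move
hdfs fsck / -delete
```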
7. Command: -report
Run: hdfs dfsadmin -report
This command prints cluster-wide summary information, including total disk capacity, remaining capacity, and the number of missing blocks, followed by per-datanode details.
[hadoop@hadoop-vm bin]$ ./hdfs dfsadmin -report
17/07/16 17:08:36 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 50595676160 (47.12 GB)
Present Capacity: 42751913984 (39.82 GB)
DFS Remaining: 42730168320 (39.80 GB)
DFS Used: 21745664 (20.74 MB)
DFS Used%: 0.05%
Under replicated blocks: 41
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (1):
Name: 192.168.1.230:50010 (hadoop-vm)
Hostname: hadoop-vm
Decommission Status : Normal
Configured Capacity: 50595676160 (47.12 GB)
DFS Used: 21745664 (20.74 MB)
Non DFS Used: 7843762176 (7.31 GB)
DFS Remaining: 42730168320 (39.80 GB)
DFS Used%: 0.04%
DFS Remaining%: 84.45%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sun Jul 16 17:08:35 CST 2017
8. Command: -safemode
Safe mode is a read-only NameNode state (entered automatically at startup until enough block reports arrive); it is queried and toggled with hdfs dfsadmin -safemode.
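A sketch of the usual safe-mode operations (these require administrator privileges on the cluster):

```shell
# Query whether the NameNode is currently in safe mode
hdfs dfsadmin -safemode get

# Enter safe mode manually (HDFS becomes read-only)
hdfs dfsadmin -safemode enter

# Leave safe mode and resume normal write operations
hdfs dfsadmin -safemode leave

# Block until the NameNode leaves safe mode on its own
hdfs dfsadmin -safemode wait
```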
For reference, the usage of the namenode administrator command can be printed with -h:
[hadoop@hadoop-vm bin]$ ./hdfs namenode -h
Usage: hdfs namenode [-backup] |
[-checkpoint] |
[-format [-clusterid cid ] [-force] [-nonInteractive] ] |
[-upgrade [-clusterid cid] [-renameReserved<k-v pairs>] ] |
[-upgradeOnly [-clusterid cid] [-renameReserved<k-v pairs>] ] |
[-rollback] |
[-rollingUpgrade <rollback|downgrade|started> ] |
[-finalize] |
[-importCheckpoint] |
[-initializeSharedEdits] |
[-bootstrapStandby] |
[-recover [ -force] ] |
[-metadataVersion ] ]
Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|resourcemanager:port> specify a ResourceManager
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
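The generic options above apply to most hdfs subcommands; a hedged example of overriding configuration on the command line (paths and the NameNode address are illustrative):

```shell
# Upload with a per-command replication factor of 2
hdfs dfs -D dfs.replication=2 -put access.log /edu/

# Point the client at a specific NameNode for a single command
hdfs dfs -fs hdfs://hadoop-vm:8020 -ls /
```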
These are the most commonly used commands. For the rest, consult the built-in help or the official documentation, and experiment with them yourself.