HDFS write data flow:
Create an HDFS Java project and a User Library named hdfslib containing:
hadoop-2.6.5\share\hadoop\hdfs\hadoop-hdfs-2.6.5.jar
hadoop-2.6.5\share\hadoop\hdfs\lib\ (select all)
hadoop-2.6.5\share\hadoop\common\hadoop-common-2.6.5.jar
hadoop-2.6.5\share\hadoop\common\lib\ (select all)
Add the User Library hdfslib to the Java project's build path
Overall workflow of the MapReduce framework:
Create the wordcount/input directory in HDFS: hdfs dfs -mkdir -p /wordcount/input
Create the text file a.txt: vi a.txt
i love you angelababy
i love you liuyifei
i love you tangyan
i love you zhaoliying
i love you fanbingbing
i love you gaoshumaliya
i love you java
i love you scala
Make nine copies of a.txt, for ten files in total:
cp a.txt a.txt.2
cp a.txt a.txt.3
cp a.txt a.txt.4
cp a.txt a.txt.5
cp a.txt a.txt.6
cp a.txt a.txt.7
cp a.txt a.txt.8
cp a.txt a.txt.9
cp a.txt a.txt.10
Upload the ten a.txt files to the HDFS /wordcount/input directory: hadoop fs -put a.* /wordcount/input
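The overall flow (map, then shuffle, then reduce) can be sketched in memory without any Hadoop jars. This is a minimal sketch and not the framework itself: the class name LocalWordCountFlow and its methods are hypothetical, and it only models what a wordcount job would compute for the ten input files above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Hadoop-free model of the map -> shuffle -> reduce flow.
class LocalWordCountFlow {

    // The eight lines of a.txt from the notes.
    static final List<String> A_TXT = Arrays.asList(
            "i love you angelababy",
            "i love you liuyifei",
            "i love you tangyan",
            "i love you zhaoliying",
            "i love you fanbingbing",
            "i love you gaoshumaliya",
            "i love you java",
            "i love you scala");

    // Runs the three phases in memory over the given input files.
    static Map<String, Integer> run(List<List<String>> files) {
        // Map phase: every word of every line becomes a (word, 1) pair.
        // Shuffle phase: the pairs are grouped by word, with keys kept sorted.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (List<String> file : files) {
            for (String line : file) {
                for (String word : line.trim().split("\\s+")) {
                    if (!word.isEmpty()) {
                        grouped.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
                    }
                }
            }
        }
        // Reduce phase: sum the grouped values of each word.
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int one : entry.getValue()) {
                sum += one;
            }
            counts.put(entry.getKey(), sum);
        }
        return counts;
    }

    // The input used in the notes: a.txt plus its nine copies.
    static Map<String, Integer> runOnTenCopies() {
        List<List<String>> files = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            files.add(A_TXT);
        }
        return run(files);
    }
}
```

For the ten files, runOnTenCopies() gives 80 for each of i, love and you, and 10 for each of the eight remaining words, which is a handy reference when checking the real job's output.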
|
MapReduce programming example wordcount ---- writing the mapper
Create a MapReduce Java project and a User Library named mrlib containing:
hadoop-2.6.5\share\hadoop\mapreduce\ (all)
hadoop-2.6.5\share\hadoop\mapreduce\lib\ (all)
hadoop-2.6.5\share\hadoop\yarn\ (all)
hadoop-2.6.5\share\hadoop\yarn\lib\ (all)
Add the User Libraries hdfslib and mrlib to the Java project's build path
Write the WordCountMapper class
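The notes do not show the body of WordCountMapper, so what follows is only a sketch of the logic such a mapper usually contains. In Hadoop's org.apache.hadoop.mapreduce API a mapper extends Mapper and overrides map(key, value, context); with the default text input, the key is the byte offset of the line (LongWritable) and the value is the line itself (Text). To stay runnable without the Hadoop jars, the sketch below replaces those types with long and String, and replaces context.write with a BiConsumer. The class name WordCountMapSketch and the helper mapToList are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Hadoop-free sketch of the map step; the class name is hypothetical.
class WordCountMapSketch {

    // Mirrors Mapper.map(key, value, context): 'offset' stands in for the
    // LongWritable byte offset, 'line' for the Text value, and 'out' for
    // context.write(word, 1).
    static void map(long offset, String line, BiConsumer<String, Integer> out) {
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.accept(word, 1); // emit one (word, 1) pair per occurrence
            }
        }
    }

    // Helper for experimenting: collects the emitted pairs as "word=1" strings.
    static List<String> mapToList(String line) {
        List<String> pairs = new ArrayList<>();
        map(0L, line, (word, one) -> pairs.add(word + "=" + one));
        return pairs;
    }
}
```

Calling mapToList("i love you java") returns the four pairs i=1, love=1, you=1, java=1 in input order. The offset is accepted but unused, as is common in wordcount mappers.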
|
MapReduce programming example wordcount ---- writing the reducer
Write the WordCountReducer class
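The body of WordCountReducer is likewise not shown in the notes. A wordcount reducer normally receives one word together with all the 1s emitted for it and writes out their sum. The sketch below models Reducer.reduce(key, values, context) with plain Java types so it runs without the Hadoop jars. The names WordCountReduceSketch and sumFor are hypothetical.

```java
import java.util.Arrays;
import java.util.function.BiConsumer;

// Hadoop-free sketch of the reduce step; the class name is hypothetical.
class WordCountReduceSketch {

    // Mirrors Reducer.reduce(key, values, context): 'word' stands in for the
    // Text key, 'counts' for the Iterable<IntWritable> values, and 'out' for
    // context.write(word, sum).
    static void reduce(String word, Iterable<Integer> counts, BiConsumer<String, Integer> out) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        out.accept(word, sum); // emit one (word, total) pair per key
    }

    // Helper for experimenting: returns the total emitted for one word.
    static int sumFor(String word, Integer... counts) {
        int[] result = new int[1];
        reduce(word, Arrays.asList(counts), (w, total) -> result[0] = total);
        return result[0];
    }
}
```

For example, sumFor("love", 1, 1, 1) returns 3. Summing the values rather than counting them keeps the logic correct if a combiner has already pre-aggregated some of the 1s.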
|
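MapReduce programming example wordcount ---- writing the job submission client
Write the JobClient class
Exit status of the previous command: echo $?
ls
echo $?
0
true
echo $?
0
false
echo $?
1
dirr (a nonexistent command)
echo $?
127
service iptables xxxooo (an invalid argument)
echo $?
2
Export the MapReduce Java project as wordcount.jar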
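The exit status matters for the job client because a driver commonly ends with System.exit(job.waitForCompletion(true) ? 0 : 1), so that `echo $?` after `hadoop jar` tells whether the job succeeded. Whether this JobClient does so is an assumption, since its source is not shown. The sketch below illustrates both sides in plain Java: reading a command's exit status (the equivalent of `echo $?`) and mapping a success flag to an exit code. It assumes a Unix-like system with `sh` on the PATH. The class and method names are hypothetical.

```java
import java.io.IOException;

// Hypothetical helper illustrating exit statuses; not part of Hadoop.
class ExitStatusSketch {

    // Runs a command line through sh and returns its exit status,
    // which is what `echo $?` would print right after the command.
    static int exitStatus(String commandLine) {
        try {
            Process process = new ProcessBuilder("sh", "-c", commandLine)
                    .inheritIO() // let the command use this program's stdin/stdout/stderr
                    .start();
            return process.waitFor();
        } catch (IOException e) {
            throw new IllegalStateException("could not start sh", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    // Shell convention: 0 means success, any non-zero value means failure.
    // A driver could pass the result of job.waitForCompletion(true) here.
    static int exitCodeFor(boolean jobSucceeded) {
        return jobSucceeded ? 0 : 1;
    }
}
```

For example, exitStatus("true") returns 0 and exitStatus("false") returns 1, matching the shell session above.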
|
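MapReduce programming example wordcount ---- submitting and running the program
Upload wordcount.jar from the root of drive D to the home directory (~) on centos001: sftp> put d:/wordcount.jar
Run wordcount.jar: hadoop jar wordcount.jar com.dohit.hadoop.JobClient
Check the YARN job status: 172.17.1.28:8088
View the result in the output directory: hadoop fs -cat /wordcount/output/part-r-00001
Create the text file b.txt: vi b.txt
a b c d e f g g i j k l m n h k j
Upload b.txt to the HDFS /wordcount/input directory: hadoop fs -put b.txt /wordcount/input
Delete the output directory of the previous run first, because by default a job refuses to write to an output directory that already exists: hadoop fs -rm -r /wordcount/output
Run wordcount.jar again: hadoop jar wordcount.jar com.dohit.hadoop.JobClient
Check the YARN job status: 172.17.1.28:8088
View the result in the output directory: hadoop fs -cat /wordcount/output/part-r-00001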
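A note on the file name read above: each reduce task writes one file named part-r-NNNNN, where NNNNN is the zero-based index of the task. With Hadoop's default of one reduce task the only file is part-r-00000, so reading part-r-00001 suggests that this JobClient configures at least two reduce tasks. That is an inference, since its source is not shown. The sketch below models how a key is assigned to a reduce task in the style of Hadoop's default HashPartitioner, and how the file name is formed. It uses String.hashCode as a stand-in, whereas Hadoop hashes the bytes of the Text key, so the real assignment of a given word may differ. The class and method names are hypothetical.

```java
// Hypothetical model of key partitioning and reduce output file naming.
class PartFileSketch {

    // Same formula as the default hash partitioning: clear the sign bit of the
    // hash, then take it modulo the number of reduce tasks. String.hashCode is
    // an assumption here; Hadoop's Text key computes its hash differently.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Reduce output files are named part-r- followed by the five-digit,
    // zero-padded index of the reduce task that wrote them.
    static String reduceOutputFileName(int partition) {
        return String.format("part-r-%05d", partition);
    }
}
```

If the expected file is missing, listing the directory with hadoop fs -ls /wordcount/output shows which part files the job actually produced.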
|