Cluster environment plan
| centos102 | centos103 | centos104 |
| --- | --- | --- |
| NameNode | ResourceManager | SecondaryNameNode |
| DataNode | DataNode | DataNode |
| NodeManager | NodeManager | NodeManager |
Configure passwordless SSH login
Generate a key pair: ssh-keygen -t rsa
Press Enter through every prompt to accept the defaults.
Copy the local public key to a target machine: ssh-copy-id <target hostname>
If the ssh-copy-id command is not available on the machine, install it first.
Installing ssh-copy-id (https://blog.csdn.net/u014609263/article/details/89448245):
sudo yum -y install openssh-clients
Run the following four commands on each of the three machines to enable passwordless login among all of them:
ssh-keygen -t rsa
ssh-copy-id centos102
ssh-copy-id centos103
ssh-copy-id centos104
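After distributing the keys, it is worth confirming that every node is reachable without a password prompt before moving on. A minimal check, assuming the three hostnames from the plan above (BatchMode makes ssh fail immediately instead of asking for a password):

```shell
#!/usr/bin/env bash
# Sketch: verify passwordless SSH to every cluster node.
verify_passwordless() {
    local h
    for h in centos102 centos103 centos104; do
        # BatchMode=yes: never prompt; fail if key auth does not work
        if ssh -o BatchMode=yes "$h" hostname >/dev/null 2>&1; then
            echo "$h: ok"
        else
            echo "$h: FAILED"
        fi
    done
}
# Usage (run on each machine): verify_passwordless
```

Any `FAILED` line means `ssh-copy-id` has to be repeated for that host before the Hadoop start scripts will work, since they use SSH internally.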
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://centos102:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/moudle/hadoop-2.7.3/data</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<!-- HDFS replication factor -->
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<!-- SecondaryNameNode address -->
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>centos104:50090</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>centos103</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
etc/hadoop/slaves
centos102
centos103
centos104
Sync the configuration above to the other machines:
xsync /opt/moudle/hadoop-2.7.3/etc/
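Note that xsync is not a stock command; it is typically a small custom wrapper around rsync that pushes a path to every other node. A minimal sketch of such a script, assuming the host names from the plan above and that rsync plus passwordless SSH are already in place:

```shell
#!/usr/bin/env bash
# Sketch of an xsync-style helper: copy a file or directory
# to the same location on the other cluster nodes.
xsync_sketch() {
    local path=$1 host dir name
    dir=$(cd "$(dirname "$path")" && pwd)   # absolute parent directory
    name=$(basename "$path")                # entry to copy (strips trailing /)
    for host in centos103 centos104; do
        # -a preserves permissions and timestamps, -v prints what is copied
        rsync -av "$dir/$name" "$host:$dir"
    done
}
# Usage: xsync_sketch /opt/moudle/hadoop-2.7.3/etc/
```

A plain `scp -r` to each host would also work; rsync is the usual choice because repeated syncs only transfer changed files.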
On the NameNode machine centos102, format HDFS: hdfs namenode -format
On centos102, run start-dfs.sh
On centos103 (the ResourceManager machine), run start-yarn.sh
Check the running processes on each machine with jps:
[hadoop@centos102 ~]$ jps
25552 NodeManager
24389 DataNode
27803 Jps
24190 NameNode
[hadoop@centos103 ~]$ jps
24978 ResourceManager
25130 NodeManager
27996 Jps
23980 DataNode
[hadoop@centos104 ~]$ jps
28049 Jps
23991 DataNode
25081 NodeManager
24233 SecondaryNameNode
Access the HDFS web UI at http://192.168.56.102:50070/
Access the YARN web UI at http://192.168.56.103:8088/
Run the WordCount example; operate on the NameNode machine centos102.
The input file wc.txt is shown below; words within a line are separated by a tab:
[hadoop@centos102 ~]$ cat wc.txt
helloworld test
hadoop
centos
hdfs
yarn
namenode datanode
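The file can be recreated with printf so that the tab characters are explicit (a literal Tab inside a pasted snippet is easy to lose):

```shell
# Recreate wc.txt; \t writes a real tab character between words on a line.
printf 'helloworld\ttest\nhadoop\ncentos\nhdfs\nyarn\nnamenode\tdatanode\n' > wc.txt
cat wc.txt
```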
Create an input directory: hdfs dfs -mkdir /wcinput
Upload the input file wc.txt:
hdfs dfs -put wc.txt /wcinput/
Run the MapReduce job:
hadoop jar /opt/moudle/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /wcinput/ /wcoutput
View the result:
[hadoop@centos102 ~]$ hdfs dfs -cat /wcoutput/*
centos 1
datanode 1
hadoop 1
hdfs 1
helloworld 1
namenode 1
test 1
yarn 1
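For a small input like this, the same counts can be sanity-checked locally without Hadoop: split each line on tabs so there is one word per line, then count duplicates. This only mirrors the WordCount logic; it is not a substitute for running the job:

```shell
# Local equivalent of WordCount for the sample file:
# one word per line, sort, count duplicates, print "word count".
printf 'helloworld\ttest\nhadoop\ncentos\nhdfs\nyarn\nnamenode\tdatanode\n' > wc.txt
tr '\t' '\n' < wc.txt | sort | uniq -c | awk '{print $2, $1}'
```

The eight `word 1` lines it prints agree with the MapReduce output above.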