Hadoop Distributed Cluster Setup
hadoop000: 192.168.199.102
hadoop001: 192.168.199.247
hadoop002: 192.168.199.138
Hostname setup: sudo vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=hadoop00x (0/1/2, matching the node)
Hostname-to-IP mapping: sudo vi /etc/hosts
192.168.199.102 hadoop000
192.168.199.247 hadoop001
192.168.199.138 hadoop002
Role assignment per node:
hadoop000: NameNode/DataNode ResourceManager/NodeManager
hadoop001: DataNode NodeManager
hadoop002: DataNode NodeManager
1 Prerequisites
1) Passwordless SSH login
Run on every machine: ssh-keygen -t rsa
Then, using hadoop000 as the primary node, for example:
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop000
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop001
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop002
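The ssh-keygen step above can be made non-interactive so it is scriptable on all three nodes; a minimal sketch (empty passphrase, which is what the passwordless login between nodes relies on):

```shell
# Ensure ~/.ssh exists, then generate an RSA key pair non-interactively
# (empty passphrase), skipping generation if a key is already present.
mkdir -p ~/.ssh && chmod 700 ~/.ssh
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa -q
```

After ssh-copy-id has run for all three hosts, `ssh hadoop001` from hadoop000 should log in without a password prompt.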
2) JDK installation
Unpack the JDK archive on hadoop000 and add JAVA_HOME to the system environment variables.
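Setting JAVA_HOME can be done by appending to the hadoop user's ~/.bash_profile; a sketch (the JDK path matches the one used in hadoop-env.sh below and is an assumption about where the archive was unpacked):

```shell
# Append JAVA_HOME and PATH entries to the login profile; the JDK
# directory here is an assumed unpack location - adjust as needed.
cat >> ~/.bash_profile <<'EOF'
export JAVA_HOME=/home/hadoop/app/jdk1.7.0_79
export PATH=$JAVA_HOME/bin:$PATH
EOF
. ~/.bash_profile
```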
2 Cluster Installation
1) Hadoop installation
Unpack the Hadoop archive on hadoop000 and add HADOOP_HOME to the system environment variables.
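HADOOP_HOME is set the same way as JAVA_HOME; a sketch (the Hadoop directory name is an assumption, since the version is not stated above; adjust to the actual unpacked directory under ~/app):

```shell
# Append HADOOP_HOME and PATH entries; the Hadoop directory is an
# assumed unpack location under ~/app - adjust to the real version dir.
cat >> ~/.bash_profile <<'EOF'
export HADOOP_HOME=/home/hadoop/app/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
EOF
. ~/.bash_profile
```

Putting $HADOOP_HOME/sbin on the PATH lets the start/stop scripts in step 5 below be run from anywhere.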
2) Configure the relevant files
hadoop-env.sh
export JAVA_HOME=/home/hadoop/app/jdk1.7.0_79
core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop000:8020</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/app/tmp</value>
</property>
hdfs-site.xml
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop000:50090</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/app/tmp/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/app/tmp/dfs/data</value>
</property>
yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop000</value>
</property>
mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
slaves
hadoop000
hadoop001
hadoop002
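The slaves file lists every host that runs a DataNode/NodeManager and can be written in one step; a sketch (writes to the current directory for illustration; the real file lives in the Hadoop configuration directory, e.g. $HADOOP_HOME/etc/hadoop/slaves):

```shell
# Write the worker list; hadoop000 appears here because it also runs
# a DataNode and NodeManager in this layout.
cat > slaves <<'EOF'
hadoop000
hadoop001
hadoop002
EOF
```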
3) Distribute the installation to the hadoop001 and hadoop002 nodes
scp -r ~/app hadoop@hadoop001:~/
scp -r ~/app hadoop@hadoop002:~/
scp ~/.bash_profile hadoop@hadoop001:~/
scp ~/.bash_profile hadoop@hadoop002:~/
On hadoop001 and hadoop002, make .bash_profile take effect: source ~/.bash_profile
4) Format the NameNode (run on hadoop000 only; formatting again later will wipe the existing HDFS metadata)
bin/hdfs namenode -format
5) Start the cluster (run on hadoop000 only):
sbin/start-all.sh (equivalently, sbin/start-dfs.sh followed by sbin/start-yarn.sh)
6) Verification
1 jps on each node:
hadoop000:
SecondaryNameNode
DataNode
NodeManager
NameNode
ResourceManager
hadoop001:
NodeManager
DataNode
hadoop002:
NodeManager
DataNode
2 Web UIs:
hadoop000:50070 (HDFS NameNode UI)
hadoop000:8088 (YARN ResourceManager UI)
7) Stopping the cluster (run on hadoop000 only):
sbin/stop-all.sh