Preparation
The IP addresses of the three machines are listed below; hadoop1 serves as the master node.
192.168.126.131 hadoop1
192.168.126.132 hadoop2
192.168.126.133 hadoop3
Set the hostname on each machine, using hadoop1 as an example:
[root@hadoop1 /]# vim /etc/sysconfig/network
Contents:
NETWORKING=yes
HOSTNAME=hadoop1
Configure the hostname-to-IP mappings on each machine, again using hadoop1 as an example:
[root@hadoop1 /]# vim /etc/hosts
Contents:
192.168.126.131 hadoop1
192.168.126.132 hadoop2
192.168.126.133 hadoop3
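A quick way to confirm that the mappings work is to resolve each name from hadoop1 (assuming ping is available on the machine):
[root@hadoop1 /]# ping -c 1 hadoop2
[root@hadoop1 /]# ping -c 1 hadoop3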
Passwordless SSH login
Run ssh-keygen -t rsa on each machine:
[root@hadoop1 /]# ssh-keygen -t rsa
[root@hadoop2 /]# ssh-keygen -t rsa
[root@hadoop3 /]# ssh-keygen -t rsa
Then run these three commands on hadoop1:
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop1
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop2
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop3
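Equivalently, the three commands can be written as one loop; each ssh-copy-id still prompts once for that host's root password:
[root@hadoop1 /]# for h in hadoop1 hadoop2 hadoop3; do ssh-copy-id -i ~/.ssh/id_rsa.pub "$h"; done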
Verify that the configuration succeeded:
[root@hadoop1 /]# ssh hadoop1
Last login: Sun Aug 26 08:22:01 2018 from 192.168.126.1
[root@hadoop1 ~]#
[root@hadoop1 /]# ssh hadoop2
Last login: Sun Aug 26 08:26:41 2018 from 192.168.126.1
[root@hadoop2 ~]#
[root@hadoop1 /]# ssh hadoop3
Last login: Sun Aug 26 08:32:57 2018 from 192.168.126.1
[root@hadoop3 ~]#
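A non-interactive variant of the same check; BatchMode=yes makes ssh fail instead of prompting, so any host that still asks for a password shows up as an error:
[root@hadoop1 /]# for h in hadoop1 hadoop2 hadoop3; do ssh -o BatchMode=yes "$h" hostname; done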
First install the JDK and Hadoop on hadoop1, as follows.
Installing the JDK
Download the JDK:
[root@hadoop1 app]# wget http://p6rs3xelu.bkt.clouddn.com/jdk-8u161-linux-x64.tar.gz
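tar's -C option requires the target directory to exist, so create /soft first if it is not there yet:
[root@hadoop1 app]# mkdir -p /soft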
Extract the JDK (here into /soft/):
[root@hadoop1 app]# tar -zxvf jdk-8u161-linux-x64.tar.gz -C /soft/
Configure /etc/profile:
[root@hadoop1 app]# vim /etc/profile
Append the following to the end of profile:
export JAVA_HOME=/soft/jdk1.8.0_161
export JAVA_BIN=/soft/jdk1.8.0_161/bin
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JAVA_HOME JAVA_BIN PATH CLASSPATH
Apply the configuration file:
[root@hadoop1 app]# source /etc/profile
Check that the JDK installed successfully:
[root@hadoop1 app]# java -version
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
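You can also confirm that the profile changes themselves took effect:
[root@hadoop1 app]# echo $JAVA_HOME
/soft/jdk1.8.0_161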
Installing the Hadoop cluster
(1) Install Hadoop
Download Hadoop:
wget http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.7.0.tar.gz
Extract it, specifying the install directory:
tar -zxvf hadoop-2.6.0-cdh5.7.0.tar.gz -C /soft/
Set the HADOOP_HOME environment variable by configuring /etc/profile:
[root@hadoop1 app]# vim /etc/profile
Append the following to the end of profile:
export HADOOP_HOME=/soft/hadoop-2.6.0-cdh5.7.0
export PATH=$HADOOP_HOME/bin:$PATH
Apply the configuration file:
[root@hadoop1 app]# source /etc/profile
Verify that the environment variable is configured correctly:
[root@hadoop1 app]# hadoop
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
fs run a generic filesystem user client
version print the version
jar <jar> run a jar file
checknative [-a|-h] check native hadoop and compression libraries availability
distcp <srcurl> <desturl> copy file or directories recursively
archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
classpath prints the class path needed to get the
Hadoop jar and the required libraries
credential interact with credential providers
daemonlog get/set the log level for each daemon
trace view and modify Hadoop tracing settings
or
CLASSNAME run the class named CLASSNAME
Most commands print help when invoked w/o parameters.
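A shorter sanity check is the version subcommand (listed in the usage above), which should report the CDH build, Hadoop 2.6.0-cdh5.7.0:
[root@hadoop1 app]# hadoop version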
Configure Hadoop's configuration files (they are under /soft/hadoop-2.6.0-cdh5.7.0/etc/hadoop); see the official documentation for reference.
hadoop-env.sh
Comment out export JAVA_HOME=${JAVA_HOME} and add export JAVA_HOME=/soft/jdk1.8.0_161, as shown below:
#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/soft/jdk1.8.0_161
core-site.xml
<configuration>
<!--NameNode address; in a cluster this must be the master's hostname, not localhost-->
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop1:8020</value>
</property>
<!--Hadoop temporary file directory-->
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/tmp</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/app/temp/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/app/temp/dfs/data</value>
</property>
</configuration>
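The daemons normally create these directories themselves (the name directory is populated by the format step below), but pre-creating them on every node avoids permission surprises; a small sketch using the passwordless SSH configured earlier:
[root@hadoop1 /]# for h in hadoop1 hadoop2 hadoop3; do ssh "$h" "mkdir -p /home/hadoop/tmp /home/hadoop/app/temp/dfs/name /home/hadoop/app/temp/dfs/data"; done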
yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!--the node on which the ResourceManager runs-->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop1</value>
</property>
</configuration>
mapred-site.xml
This file does not exist by default; copy it from the template: cp mapred-site.xml.template mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
slaves (this file lists the cluster's worker nodes; hadoop1 appears here too, so the master also runs a DataNode and a NodeManager)
Set its contents to:
hadoop1
hadoop2
hadoop3
(2) Distribute the installation to hadoop2 and hadoop3 (since the JDK and Hadoop on hadoop1 were both installed under /soft, simply copy everything in the /soft directory over):
[root@hadoop1 /]# scp -r /soft root@hadoop2:/
[root@hadoop1 /]# scp -r /soft root@hadoop3:/
[root@hadoop1 /]# scp /etc/profile root@hadoop2:/etc/
[root@hadoop1 /]# scp /etc/profile root@hadoop3:/etc/
Make /etc/profile take effect on hadoop2 and hadoop3:
[root@hadoop2 /]# source /etc/profile
[root@hadoop3 /]# source /etc/profile
(3) Format the filesystem (only the very first time; do not run it again). This only needs to be done on hadoop1.
From the bin directory of the Hadoop installation, run:
[root@hadoop1 bin]# ./hdfs namenode -format
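If the format succeeded, the NameNode directory configured in hdfs-site.xml should now contain a current/ subdirectory with VERSION, fsimage and seen_txid files:
[root@hadoop1 bin]# ls /home/hadoop/app/temp/dfs/name/current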
(4) Start the cluster
From the sbin directory of the Hadoop installation, run:
[root@hadoop1 sbin]# ./start-all.sh
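Note that in Hadoop 2.x, start-all.sh is deprecated and merely delegates to the HDFS and YARN start scripts; running the two separately is equivalent and makes it easier to see which layer fails:
[root@hadoop1 sbin]# ./start-dfs.sh
[root@hadoop1 sbin]# ./start-yarn.sh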
(5) Verify
Method 1: verify with jps
[root@hadoop1 sbin]# jps
NodeManager
DataNode
ResourceManager
Jps
SecondaryNameNode
NameNode
[root@hadoop2 /]# jps
Jps
DataNode
NodeManager
[root@hadoop3 /]# jps
Jps
NodeManager
DataNode
Method 2: verify through a browser
Open http://192.168.126.131:50070 (the NameNode web UI) in a browser.
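Besides the NameNode UI on port 50070, the YARN ResourceManager serves its own web UI, by default at http://192.168.126.131:8088. The number of live DataNodes can also be checked from the command line:
[root@hadoop1 sbin]# hdfs dfsadmin -report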
(6) Stop the cluster
[root@hadoop1 sbin]# ./stop-all.sh
On a virtual machine you may also need to turn off the firewall:
systemctl stop firewalld.service
systemctl disable firewalld.service
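The systemctl commands apply to CentOS 7; on CentOS 6, which the /etc/sysconfig/network style of hostname configuration suggests, the equivalents are:
service iptables stop
chkconfig iptables off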