Installing Hadoop 3.0.3 on Linux (CentOS 6.5)
1. Download the Hadoop package
Download the matching package hadoop-3.0.3.tar.gz from the official site http://hadoop.apache.org/releases.html (pick the version you need).
2. Extract the package
tar -zxvf hadoop-3.0.3.tar.gz -C /usr/local/
This extracts the archive into the local install directory, producing /usr/local/hadoop-3.0.3.
3. Edit /etc/profile (vi /etc/profile) to set the Hadoop environment variables
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_181
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export HADOOP_HOME=/usr/local/hadoop-3.0.3
export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${PATH}
Note: make sure Java is installed before installing Hadoop.
To make the new settings take effect, run:
source /etc/profile
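As a quick sanity check, you can confirm that both bin directories actually ended up on PATH. This is just a sketch: the JAVA_HOME and HADOOP_HOME values below are the example paths from this guide and may differ on your machine.

```shell
# Sanity check after `source /etc/profile`.
# These are this guide's example paths; adjust them to your install.
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_181
export HADOOP_HOME=/usr/local/hadoop-3.0.3
export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${PATH}

# Print each PATH entry on its own line; the two bin dirs should come first
echo "$PATH" | tr ':' '\n' | head -n 2
```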
4. Configure Hadoop
(1) Set JAVA_HOME in hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_181
(2)core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
(3)hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/data/hadoop/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/data/hadoop/data</value>
</property>
</configuration>
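The two dfs.*.dir paths above must exist and be writable by the user that starts Hadoop, or the NameNode/DataNode will fail to come up. A minimal sketch of pre-creating them (it uses a throwaway temp directory so it is safe to run anywhere; in the real install the root is /data/hadoop, owned by your Hadoop user):

```shell
# Pre-create the NameNode and DataNode storage directories from hdfs-site.xml.
# DATA_ROOT is /data/hadoop in the config above; a temp dir is used here
# so this sketch can run without root.
DATA_ROOT=$(mktemp -d)
mkdir -p "$DATA_ROOT/name" "$DATA_ROOT/data"
# In a real install, as root:  chown -R xxx:xxx /data/hadoop
ls "$DATA_ROOT"
```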
(4)mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
This configures MapReduce to run on the YARN framework.
(5) yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
5. Initialize HDFS (format the filesystem)
hdfs namenode -format
6. In the install's sbin directory (/usr/local/hadoop-3.0.3/sbin), run the start-dfs.sh and start-yarn.sh scripts to start the NameNode, DataNode, and SecondaryNameNode, plus the resource-management services ResourceManager and NodeManager. Avoid starting them as root.
You may run into permission problems at startup, for example:
[xxx@localhost sbin]$ ./start-dfs.sh
Starting namenodes on [localhost]
localhost: Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
localhost: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Starting datanodes
localhost: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Starting secondary namenodes [localhost.localdomain]
localhost.localdomain: Warning: Permanently added 'localhost.localdomain' (RSA) to the list of known hosts.
localhost.localdomain: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
2018-09-07 01:03:52,287 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
7. Configure passwordless SSH from the startup user to localhost, which resolves the permission errors seen in step 6
(1) As the startup user, change into ~/.ssh and generate a key pair with ssh-keygen -t rsa, pressing Enter at each prompt:
[xxx@localhost .ssh]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/xxx/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/xxx/.ssh/id_rsa.
Your public key has been saved in /home/xxx/.ssh/id_rsa.pub.
The key fingerprint is:
21:46:2d:36:bb:34:73:xx:xx:xx:xx:xx:xx:xx:xx:xx [email protected]
The key's randomart image is:
+--[ RSA 2048]----+
| .. ... E |
| .+ . .= + . |
| .o+..o = + |
| .=... + + o |
| . =S o . .|
| . |
| |
| |
| |
+-----------------+
When it finishes, the public key (id_rsa.pub) and private key (id_rsa) are written to the current directory.
(2) Set up the authorized_keys file by appending the public key to it. (To access another host instead, upload the public key to that host and append it to the authorized_keys file there.)
If the file does not exist, create it with touch authorized_keys.
# Create authorized_keys if it does not exist
touch authorized_keys
# authorized_keys MUST have mode 600
chmod 600 authorized_keys
# Append user xxx's public key to the authorization file
cat id_rsa.pub >> authorized_keys
(3) Verify that you can log in to localhost without a password
[xxx@localhost hadoop]$ ssh localhost
Last login: Fri Sep 7 01:27:12 2018 from localhost
If you log in without being prompted for a password, the SSH setup is working.
In general, for machine A to access machine B: generate the key pair on A, upload A's public key to B, and, if logging in to B as user U, append A's public key to the authorized_keys file in /home/U/.ssh on B.
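Steps (1) and (2) can also be scripted non-interactively. The sketch below uses a placeholder key line and a temporary directory so it is safe to run anywhere; in practice you would operate on ~/.ssh with the real id_rsa.pub generated by ssh-keygen.

```shell
# Non-interactive sketch of the authorized_keys setup from step (2).
# SSH_DIR stands in for ~/.ssh; the key line is a placeholder, not a real key.
SSH_DIR=$(mktemp -d)
printf 'ssh-rsa AAAAB3...placeholder xxx@localhost\n' > "$SSH_DIR/id_rsa.pub"
touch "$SSH_DIR/authorized_keys"
cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
chmod 700 "$SSH_DIR"                   # ~/.ssh itself must not be group/world writable
chmod 600 "$SSH_DIR/authorized_keys"   # sshd refuses looser permissions
```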
8. Run the startup scripts start-dfs.sh and start-yarn.sh again
./start-dfs.sh
./start-yarn.sh
If no errors are reported, startup generally succeeded. Check with the jps command:
[xxx@localhost hadoop]$ jps
4753 NameNode
5377 ResourceManager
5068 SecondaryNameNode
5484 NodeManager
4863 DataNode
7343 Jps
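The jps listing above can be turned into a quick scripted check. The sketch below inlines the sample output from this guide so it runs standalone; in practice you would replace the assignment with JPS_OUTPUT=$(jps).

```shell
# Verify that all five Hadoop daemons appear in the jps output.
# Sample output is inlined; in practice use:  JPS_OUTPUT=$(jps)
JPS_OUTPUT='4753 NameNode
5377 ResourceManager
5068 SecondaryNameNode
5484 NodeManager
4863 DataNode'
# -w matches whole words, so "NameNode" does not match "SecondaryNameNode"
for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
  echo "$JPS_OUTPUT" | grep -qw "$daemon" || { echo "missing: $daemon"; exit 1; }
done
echo "all daemons running"
```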
9. Access the web UIs
Once the daemons are up, you can browse the HDFS and YARN web interfaces; in Hadoop 3.x the NameNode UI defaults to http://localhost:9870 and the ResourceManager UI to http://localhost:8088.