Documentation: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html
http://www.diaryfolio.com/hadoop-install-steps/
Preparation
Extract the tarball to the target directory (note: later steps assume the install lives under /data01/javaapp):
tar -zxvf hadoop-2.3.0.tar.gz -C /data/javadev
Add the hadoop user and group
useradd hadoop
usermod -g hadoop hadoop
Check the hadoop user:
id hadoop
Change the owner of the extracted Hadoop directory to the hadoop user:
chown -R hadoop:hadoop hadoop-2.3.0/
Add hadoop to sudoers
visudo
Append the following line at the end:
hadoop ALL=(ALL) ALL
Save and exit.
Switch to the hadoop user:
su - hadoop
Environment variables and SSH configuration
Perform all of the following steps as the hadoop user.
Edit the environment variables:
vi .bashrc
export JAVA_HOME=/usr/java/jdk1.6.0_31/
# hadoop variables
HADOOP_COMMON_HOME=/data01/javaapp/hadoop-2.3.0
HADOOP_HDFS_HOME=/data01/javaapp/hadoop-2.3.0
HADOOP_MAPRED_HOME=/data01/javaapp/hadoop-2.3.0
HADOOP_YARN_HOME=/data01/javaapp/hadoop-2.3.0
HADOOP_CONF_DIR=/data01/javaapp/hadoop-2.3.0/etc/hadoop
YARN_CONF_DIR=/data01/javaapp/hadoop-2.3.0/etc/hadoop
export HADOOP_COMMON_HOME HADOOP_HDFS_HOME HADOOP_MAPRED_HOME HADOOP_YARN_HOME HADOOP_CONF_DIR YARN_CONF_DIR
export PATH=$PATH:$HADOOP_COMMON_HOME/bin
Save and exit.
Apply the environment variables immediately:
source .bashrc
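Before moving on, it is easy to verify that all of the variables actually took effect. A minimal sketch (the function name `check_hadoop_env` is introduced here for illustration; it only checks the variable names exported in the .bashrc above):

```shell
# check_hadoop_env: print any of the required Hadoop variables that are
# unset or empty in the current shell.
check_hadoop_env() {
    missing=""
    for v in JAVA_HOME HADOOP_COMMON_HOME HADOOP_HDFS_HOME HADOOP_MAPRED_HOME \
             HADOOP_YARN_HOME HADOOP_CONF_DIR YARN_CONF_DIR; do
        eval val=\"\$$v\"
        [ -z "$val" ] && missing="$missing $v"
    done
    if [ -z "$missing" ]; then
        echo "all required variables are set"
    else
        echo "missing:$missing"
    fi
}

check_hadoop_env
```

If anything is reported missing, re-check .bashrc and run source .bashrc again.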
Set up passwordless SSH:
ssh-keygen -t rsa -P ""
cd ~/.ssh
cat id_rsa.pub >> ~/.ssh/authorized_keys
chmod 644 authorized_keys
cd ~
chmod 700 .ssh
Hadoop configuration files
If you see the error:
connect to host localhost port 22: Connection refused
the SSH daemon is not listening on the default port 22. Edit the environment file
hadoop-2.3.0/etc/hadoop/hadoop-env.sh
and append the line:
export HADOOP_SSH_OPTS="-p <num>"
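The append can be made idempotent so that re-running the setup does not duplicate the line. A sketch, using a scratch file and the placeholder port 2222 (both illustrative; substitute the real hadoop-env.sh path and your actual SSH port):

```shell
# Append the SSH options line to an env file only if it is not already there.
# envfile and port 2222 are placeholders for this sketch.
envfile=$(mktemp)

append_ssh_opts() {
    grep -q '^export HADOOP_SSH_OPTS=' "$1" \
        || echo 'export HADOOP_SSH_OPTS="-p 2222"' >> "$1"
}

append_ssh_opts "$envfile"
append_ssh_opts "$envfile"   # second call is a no-op
```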
If you see the error:
The authenticity of host 'localhost (127.0.0.1)' can't be established.
the SSH authentication file permissions are incorrect. Switch to the hadoop user (su - hadoop) and run:
chmod 644 ~/.ssh/authorized_keys
chmod 700 ~/.ssh
Create the hadoop-2.3.0/tmp directory (from inside hadoop-2.3.0/):
mkdir tmp
Modify the following four configuration files. (Hadoop 2.3.0 ships only mapred-site.xml.template; copy it to hadoop-2.3.0/etc/hadoop/mapred-site.xml before editing.)
# hadoop-2.3.0/etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data01/javaapp/hadoop-2.3.0/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>
# vi etc/hadoop/mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
# vi etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
# vi etc/hadoop/yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
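A quick way to sanity-check the edits is to pull a value back out of a file. This sketch writes a minimal core-site.xml to a scratch directory (an illustrative copy, not the real config path) and extracts fs.default.name with sed:

```shell
# Write a minimal copy of core-site.xml to a scratch dir (illustrative path)
# and pull the fs.default.name value back out of it.
tmpdir=$(mktemp -d)
cat > "$tmpdir/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>
EOF

# Print the <value> on the line immediately after the fs.default.name line.
uri=$(sed -n '/<name>fs.default.name<\/name>/{n;s/.*<value>\(.*\)<\/value>.*/\1/p;}' \
      "$tmpdir/core-site.xml")
echo "$uri"
rm -rf "$tmpdir"
```

If the printed URI does not match what you configured, the XML is malformed or the property name was mistyped (a stray space inside `<name>` is enough to break it).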
Starting and stopping Hadoop
The commands below use $HADOOP_HOME for the install directory; the .bashrc above only exports HADOOP_COMMON_HOME, so export HADOOP_HOME to the same path (or substitute it) first.
Format the HDFS filesystem (on 2.x, bin/hdfs namenode -format is the non-deprecated form):
$HADOOP_HOME/bin/hadoop namenode -format
Start the daemons:
$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
$HADOOP_HOME/sbin/hadoop-daemon.sh start datanode
$HADOOP_HOME/sbin/hadoop-daemon.sh start secondarynamenode
$HADOOP_HOME/sbin/yarn-daemon.sh start resourcemanager
$HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
View the run logs:
less logs/hadoop-hadoop-datanode-UAT.log
Stop the daemons:
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh stop historyserver
$HADOOP_HOME/sbin/yarn-daemon.sh stop nodemanager
$HADOOP_HOME/sbin/yarn-daemon.sh stop resourcemanager
$HADOOP_HOME/sbin/hadoop-daemon.sh stop secondarynamenode
$HADOOP_HOME/sbin/hadoop-daemon.sh stop datanode
$HADOOP_HOME/sbin/hadoop-daemon.sh stop namenode
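The six start commands and their stop counterparts (in reverse order) can be wrapped in a small helper so the ordering is not retyped each time. A sketch, assuming $HADOOP_HOME points at the install directory; the DRY_RUN flag is a hypothetical addition here so the script can be exercised without a running cluster:

```shell
# Start or stop all single-node daemons in the proper order.
# With DRY_RUN=1, only print the commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$@"; else "$@"; fi
}

hadoop_all() {
    case "$1" in
    start)
        run "$HADOOP_HOME/sbin/hadoop-daemon.sh" start namenode
        run "$HADOOP_HOME/sbin/hadoop-daemon.sh" start datanode
        run "$HADOOP_HOME/sbin/hadoop-daemon.sh" start secondarynamenode
        run "$HADOOP_HOME/sbin/yarn-daemon.sh" start resourcemanager
        run "$HADOOP_HOME/sbin/yarn-daemon.sh" start nodemanager
        run "$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh" start historyserver
        ;;
    stop)   # reverse order of start
        run "$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh" stop historyserver
        run "$HADOOP_HOME/sbin/yarn-daemon.sh" stop nodemanager
        run "$HADOOP_HOME/sbin/yarn-daemon.sh" stop resourcemanager
        run "$HADOOP_HOME/sbin/hadoop-daemon.sh" stop secondarynamenode
        run "$HADOOP_HOME/sbin/hadoop-daemon.sh" stop datanode
        run "$HADOOP_HOME/sbin/hadoop-daemon.sh" stop namenode
        ;;
    esac
}
```

Invoke as hadoop_all start or hadoop_all stop; with DRY_RUN=1 it only prints what would be executed.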
Check the Hadoop processes running in the JVM
# The Java Virtual Machine Process Status Tool (jps) acts like ps and
# lists the running Java processes, so it shows all Hadoop daemons.
hduser@diaryfoliovm:/opt/hadoop$ jps
2243 TaskTracker
2314 JobTracker
1923 DataNode
2895 SecondaryNameNode
1234 Jps
1788 NameNode
(This sample output is from a Hadoop 1.x install; on 2.3.0 with YARN you will see ResourceManager, NodeManager, and JobHistoryServer instead of JobTracker and TaskTracker.)
Running the wordcount example
Create an input file:
vi hadoop-2.3.0/tmp/input
Paste the following content into it:
Read: Father of Santa Barbara Victim Sobs and Rails Against Son's Death That "last chance" turned bleak – a night that reflected his ambitions, fury and warped perspectives. It became a flashpoint leading up to last Friday's attacks that left six others dead and 13 injured. WATCH: Massive Crowd Comes to Mourn Santa Barbara Shooting Victims Rodger bought a bottle of vodka that night, taking a few shots for courage, maybe downing one too many. Other students were partying – "good looking popular kids," as he identified them. Without the buzz, he would have been too intimidated to mingle.
Start Hadoop:
$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
$HADOOP_HOME/sbin/hadoop-daemon.sh start datanode
$HADOOP_HOME/sbin/hadoop-daemon.sh start secondarynamenode
$HADOOP_HOME/sbin/yarn-daemon.sh start resourcemanager
$HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
Run the wordcount job:
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.3.0.jar wordcount file:///data01/javaapp/hadoop-2.3.0/tmp/input output2
List the generated output files:
bin/hadoop fs -ls -R output2
14/05/28 14:06:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-rw-r--r--   1 hadoop supergroup          0 2014-05-28 13:58 output2/_SUCCESS
-rw-r--r--   1 hadoop supergroup       4574 2014-05-28 13:58 output2/part-r-00000
View the word counts:
bin/hadoop fs -cat output2/part-r-00000
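For a file this small, the result can be cross-checked locally: wordcount splits on whitespace and counts duplicates, which a plain shell pipeline reproduces. A sketch for verification only, not a replacement for the job (the function name `wordcount` is introduced here for illustration):

```shell
# Local equivalent of the wordcount job for small inputs:
# split on whitespace, drop empty tokens, count duplicates,
# and print word<TAB>count like the part-r-00000 file.
wordcount() {
    tr -s '[:space:]' '\n' < "$1" | grep -v '^$' | sort | uniq -c \
        | awk '{print $2 "\t" $1}'
}
```

Running it against tmp/input should broadly agree with the HDFS output (ordering and punctuation handling may differ slightly from the Java tokenizer).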
Delete the generated output files:
bin/hadoop fs -rmr output*
Check cluster status:
bin/hadoop dfsadmin -report
Web interfaces
http://192.168.1.22:50075/dataNodeHome.jsp
HDFS file management
http://192.168.1.22:50070/dfshealth.html
http://192.168.1.22:50090/status.jsp
Node management
http://192.168.1.22:8042/node
Application management (ResourceManager)
http://192.168.1.22:8088
Hadoop default ports
http://hsrong.iteye.com/blog/1374734
Hadoop 2.3 cluster setup