(1) Building hadoop 2.7.1 from source | http://aperise.iteye.com/blog/2246856 |
(2) hadoop 2.7.1 installation preparation | http://aperise.iteye.com/blog/2253544 |
(3) Cluster installation supported by both 1.x and 2.x | http://aperise.iteye.com/blog/2245547 |
(4) hbase installation preparation | http://aperise.iteye.com/blog/2254451 |
(5) hbase installation | http://aperise.iteye.com/blog/2254460 |
(6) snappy installation | http://aperise.iteye.com/blog/2254487 |
(7) hbase performance tuning | http://aperise.iteye.com/blog/2282670 |
(8) Benchmarking hbase with Yahoo! YCSB | http://aperise.iteye.com/blog/2248863 |
(9) spring-hadoop in practice | http://aperise.iteye.com/blog/2254491 |
(10) ZooKeeper-based Hadoop HA cluster installation | http://aperise.iteye.com/blog/2305809 |
This article covers the following tasks:
(1) Raising the Linux open-file and process limits
(2) Firewall settings
(3) Setting the hostname
(4) Configuring /etc/hosts
(5) Downloading and installing the JDK
(6) Creating a user for the Hadoop installation and configuring passwordless SSH login
1.Raise the Linux open-file and process limits
1) Edit /etc/security/limits.conf and append the following:
* soft nofile 409600
* hard nofile 409600
* soft nproc 409600
* hard nproc 819200
2) Edit /etc/pam.d/login and append the following line, which loads the pam_limits module so the limits above are applied at login:
session required pam_limits.so
3) Reboot the system for the changes to take effect.
4) Official Linux guidance on this value
As for what value is appropriate, the Linux documentation (man limits.conf) offers no estimation method or recommended value. It only notes that "All items support the values -1, unlimited or infinity indicating no limit, except for priority and nice", i.e. every item except priority and nice may be set to -1, unlimited or infinity to lift the limit entirely. Do not do that: after such a change the system may fail to boot, as described in this post:
http://www.cnblogs.com/zengkefu/p/5635153.html
NAME
    limits.conf - configuration file for the pam_limits module
DESCRIPTION
    The syntax of the lines is as follows:
        <domain> <type> <item> <value>
    The fields listed above should be filled as follows:
    <domain>
        · a username
        · a groupname, with @group syntax. This should not be confused with netgroups.
        · the wildcard *, for default entry.
        · the wildcard %, for maxlogins limit only, can also be used with %group syntax.
    <type>
        hard  for enforcing hard resource limits. These limits are set by the superuser and enforced by the kernel. The user cannot raise his requirement of system resources above such values.
        soft  for enforcing soft resource limits. These limits are ones that the user can move up or down within the range permitted by any pre-existing hard limits. The values specified with this token can be thought of as default values, for normal system usage.
        -     for enforcing both soft and hard resource limits together. Note, if you specify a type of '-' but neglect to supply the item and value fields then the module will never enforce any limits on the specified user/group etc.
    <item>
        core         limits the core file size (KB)
        data         maximum data size (KB)
        fsize        maximum filesize (KB)
        memlock      maximum locked-in-memory address space (KB)
        nofile       maximum number of open files
        rss          maximum resident set size (KB) (ignored in Linux 2.4.30 and higher)
        stack        maximum stack size (KB)
        cpu          maximum CPU time (minutes)
        nproc        maximum number of processes
        as           address space limit (KB)
        maxlogins    maximum number of logins for this user
        maxsyslogins maximum number of logins on system
        priority     the priority to run user process with (negative values boost process priority)
        locks        maximum locked files (Linux 2.4 and higher)
        sigpending   maximum number of pending signals (Linux 2.6 and higher)
        msgqueue     maximum memory used by POSIX message queues (bytes) (Linux 2.6 and higher)
        nice         maximum nice priority allowed to raise to (Linux 2.6.12 and higher)
        rtprio       maximum realtime priority allowed for non-privileged processes (Linux 2.6.12 and higher)
    In general, individual limits have priority over group limits, so if you impose no limits for the admin group, but one of the members in this group has a limits line, the user will have its limits set according to this line.
    Also, please note that all limit settings are set per login. They are not global, nor are they permanent; they exist only for the duration of the session.
    In the limits configuration file, the '#' character introduces a comment, after which the rest of the line is ignored.
    The pam_limits module does its best to report configuration problems found in its configuration file via syslog(3).
EXAMPLES
    These are some example lines which might be specified in /etc/security/limits.conf.
        * soft core 0
        * hard rss 10000
        @student hard nproc 20
        @faculty soft nproc 20
        @faculty hard nproc 50
        ftp hard nproc 0
        @student - maxlogins 4
SEE ALSO
    pam_limits(8), pam.d(5), pam(8)
AUTHOR
    pam_limits was initially written by Cristian Gafton <[email protected]>
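After logging in again, the effective limits for the current session can be checked with ulimit. This is a quick sanity check added here for illustration, not part of the original steps:

```shell
# Print the per-session limits that limits.conf governs.
# These reflect the current login session only; edits to
# /etc/security/limits.conf apply to new logins.
echo "open files (nofile): $(ulimit -n)"
echo "processes  (nproc):  $(ulimit -u)"
```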
2.Firewall settings
1) Takes effect after a reboot
Enable: chkconfig iptables on
Disable: chkconfig iptables off
2) Takes effect immediately, but is lost after a reboot
Enable: service iptables start
Disable: service iptables stop
3) vi /etc/selinux/config
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=enforcing #comment this line out
SELINUX=disabled #add this line
# SELINUXTYPE= can take one of these three values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
#SELINUXTYPE=targeted #commented out
4) Apply the SELinux change immediately (without a reboot):
setenforce 0
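As a quick check, the configured mode can be read back out of the config file. This sketch uses a temp file standing in for /etc/selinux/config so it is self-contained:

```shell
# Parse the SELINUX= setting from a config file.
# On a real node the path would be /etc/selinux/config.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
# SELINUX= can take one of these three values:
SELINUX=disabled
SELINUXTYPE=targeted
EOF
mode=$(grep '^SELINUX=' "$cfg" | cut -d= -f2)
echo "configured SELinux mode: $mode"
rm -f "$cfg"
```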
3.Set the hostname
vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=nmsc0
4.Configure /etc/hosts
vi /etc/hosts
::1 localhost6.localdomain6 localhost6
192.168.181.66 nmsc0
192.168.88.21 nmsc1
192.168.88.22 nmsc2
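The hostname-to-IP mapping can be sanity-checked with awk. The sketch below runs against a sample file so it is self-contained; on a real node it would read /etc/hosts directly:

```shell
# Look up the IP for a hostname in an /etc/hosts-style file.
hosts=$(mktemp)
cat > "$hosts" <<'EOF'
192.168.181.66 nmsc0
192.168.88.21 nmsc1
192.168.88.22 nmsc2
EOF
ip=$(awk '$2 == "nmsc1" {print $1}' "$hosts")
echo "nmsc1 -> $ip"
rm -f "$hosts"
```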
5.Download and install the JDK
#2.First remove the bundled older JDK or OpenJDK
#Check the existing Java version with: java -version
#List the matching packages: rpm -qa | grep gcj
#Then remove them: rpm -e --nodeps java-1.5.0-gcj-1.5.0.0-29.1.el6.x86_64
#3.Install jdk-7u65-linux-x64.gz
#Download jdk-7u65-linux-x64.gz to /opt/java/jdk-7u65-linux-x64.gz and unpack it
cd /opt/java/
tar -zxvf jdk-7u65-linux-x64.gz
#Configure the system environment variables
vi /etc/profile
#Append the following at the end of the file
export JAVA_HOME=/opt/java/jdk1.7.0_65
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
#Apply the changes
source /etc/profile
#4.Check that the JDK environment is configured correctly
java -version
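The effect of the profile edit can be sketched with a temp file standing in for /etc/profile (paths match the steps above; on a real node you would edit /etc/profile itself):

```shell
# Append the exports to a profile fragment, source it, and
# confirm the variables are visible in the current shell.
profile=$(mktemp)
cat >> "$profile" <<'EOF'
export JAVA_HOME=/opt/java/jdk1.7.0_65
export PATH=$PATH:$JAVA_HOME/bin
EOF
. "$profile"
echo "JAVA_HOME=$JAVA_HOME"
rm -f "$profile"
```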
6.Create a user for the Hadoop installation and configure passwordless SSH login
1) Create a user for the Hadoop installation
#Remove any existing hadoop user (and its home directory) first
userdel -r hadoop
#Create the hadoop user
useradd hadoop
#Set the hadoop user's password
passwd hadoop
2) Configure passwordless SSH login
My three machines are 192.168.181.66, 192.168.88.21 and 192.168.88.22; the commands below were run on 192.168.181.66:
su - hadoop
#Generate the public/private key pair; this must be done on every node in the cluster
ssh-keygen -t rsa
#When a client connects over ssh, the remote machine checks the client's key against its own .ssh/authorized_keys. Here /home/hadoop/.ssh/authorized_keys collects the public keys (each node's /home/hadoop/.ssh/id_rsa.pub), so running the commands below on the master node alone is enough for passwordless SSH between master and slave nodes.
cd /home/hadoop/.ssh/
#First append the master node's public key to authorized_keys
cat id_rsa.pub>>authorized_keys
#Then append each slave node's public key to authorized_keys
ssh [email protected] cat /home/hadoop/.ssh/id_rsa.pub>> authorized_keys
ssh [email protected] cat /home/hadoop/.ssh/id_rsa.pub>> authorized_keys
#Distribute the master's authorized_keys to the slave nodes
scp -r /home/hadoop/.ssh/authorized_keys [email protected]:/home/hadoop/.ssh/
scp -r /home/hadoop/.ssh/authorized_keys [email protected]:/home/hadoop/.ssh/
#The permissions on /home/hadoop/.ssh/authorized_keys must be set as follows
chmod 600 /home/hadoop/.ssh/authorized_keys
#Passwordless login to nmsc1
ssh nmsc1
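The permission requirement can be reproduced locally. This is a self-contained sketch using a temp directory in place of /home/hadoop/.ssh:

```shell
# sshd refuses key authentication if .ssh or authorized_keys
# is group/world accessible: 700 on the directory, 600 on the file.
sshdir="$(mktemp -d)/.ssh"
mkdir -p "$sshdir"
touch "$sshdir/authorized_keys"
chmod 700 "$sshdir"
chmod 600 "$sshdir/authorized_keys"
stat -c '%a %n' "$sshdir" "$sshdir/authorized_keys"
```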
3) Useful ssh service commands
#Check the ssh version
ssh -V
#Check the openssl version
openssl version -a
#Restart the sshd service
/etc/rc.d/init.d/sshd restart
#Log into the remote machine nmsc1 over ssh
ssh nmsc1 #or: ssh hadoop@nmsc1
#Show verbose debug output when logging into nmsc1
ssh -v2 nmsc1