(1) Building hadoop 2.7.1 from source | http://aperise.iteye.com/blog/2246856 |
(2) hadoop 2.7.1 installation preparation | http://aperise.iteye.com/blog/2253544 |
(3) Cluster installation supported by both 1.x and 2.x | http://aperise.iteye.com/blog/2245547 |
(4) hbase installation preparation | http://aperise.iteye.com/blog/2254451 |
(5) hbase installation | http://aperise.iteye.com/blog/2254460 |
(6) snappy installation | http://aperise.iteye.com/blog/2254487 |
(7) hbase performance tuning | http://aperise.iteye.com/blog/2282670 |
(8) Benchmarking hbase with Yahoo! YCSB | http://aperise.iteye.com/blog/2248863 |
(9) spring-hadoop in practice | http://aperise.iteye.com/blog/2254491 |
(10) ZooKeeper-based Hadoop HA cluster installation | http://aperise.iteye.com/blog/2305809 |
This article covers the following tasks:
(1) Raising the Linux open-file and process limits
(2) Firewall settings
(3) Setting the hostname
(4) Configuring /etc/hosts
(5) Downloading and installing the JDK
(6) Creating a user for the Hadoop installation and configuring passwordless SSH login
1.Raise the Linux open-file and process limits
1) Edit /etc/security/limits.conf and append the following:
* soft nofile 409600
* hard nofile 409600
* soft nproc 409600
* hard nproc 819200
2) Edit /etc/pam.d/login and append the following line, which loads the pam_limits module so the limits above are applied at login:
session required pam_limits.so
3) Reboot the system for the changes to take effect.
4) Official Linux guidance on this value
As for what value is appropriate, the Linux documentation (man limits.conf) offers no estimation method or recommended value. It only notes that "All items support the values -1, unlimited or infinity indicating no limit, except for priority and nice", i.e. every item except priority and nice may be set to -1, unlimited or infinity to lift the limit entirely. Do not do that: after such a change the system may fail to boot, as described in this post:
http://www.cnblogs.com/zengkefu/p/5635153.html
NAME
    limits.conf - configuration file for the pam_limits module
DESCRIPTION
    The syntax of the lines is as follows:
        <domain> <type> <item> <value>
    The fields listed above should be filled as follows:
    <domain>
        · a username
        · a groupname, with @group syntax. This should not be confused with netgroups.
        · the wildcard *, for default entry.
        · the wildcard %, for maxlogins limit only, can also be used with %group syntax.
    <type>
        hard  for enforcing hard resource limits. These limits are set by the superuser and enforced by the kernel. The user cannot raise his requirement of system resources above such values.
        soft  for enforcing soft resource limits. These limits are ones that the user can move up or down within the range permitted by any pre-existing hard limits. The values specified with this token can be thought of as default values, for normal system usage.
        -     for enforcing both soft and hard resource limits together. Note, if you specify a type of '-' but neglect to supply the item and value fields then the module will never enforce any limits on the specified user/group etc.
    <item>
        core         limits the core file size (KB)
        data         maximum data size (KB)
        fsize        maximum filesize (KB)
        memlock      maximum locked-in-memory address space (KB)
        nofile       maximum number of open files
        rss          maximum resident set size (KB) (ignored in Linux 2.4.30 and higher)
        stack        maximum stack size (KB)
        cpu          maximum CPU time (minutes)
        nproc        maximum number of processes
        as           address space limit (KB)
        maxlogins    maximum number of logins for this user
        maxsyslogins maximum number of logins on system
        priority     the priority to run user process with (negative values boost process priority)
        locks        maximum locked files (Linux 2.4 and higher)
        sigpending   maximum number of pending signals (Linux 2.6 and higher)
        msgqueue     maximum memory used by POSIX message queues (bytes) (Linux 2.6 and higher)
        nice         maximum nice priority allowed to raise to (Linux 2.6.12 and higher)
        rtprio       maximum realtime priority allowed for non-privileged processes (Linux 2.6.12 and higher)
    In general, individual limits have priority over group limits, so if you impose no limits for the admin group, but one of the members in this group has a limits line, the user will have its limits set according to this line.
    Also, please note that all limit settings are set per login. They are not global, nor are they permanent; they exist only for the duration of the session.
    In the limits configuration file, the '#' character introduces a comment, after which the rest of the line is ignored.
    The pam_limits module does its best to report configuration problems found in its configuration file via syslog(3).
EXAMPLES
    These are some example lines which might be specified in /etc/security/limits.conf.
        * soft core 0
        * hard rss 10000
        @student hard nproc 20
        @faculty soft nproc 20
        @faculty hard nproc 50
        ftp hard nproc 0
        @student - maxlogins 4
SEE ALSO
    pam_limits(8), pam.d(5), pam(8)
AUTHOR
    pam_limits was initially written by Cristian Gafton <[email protected]>
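After logging in again, the effective limits for the current session can be checked with ulimit. This is a quick sanity check added here for illustration, not part of the original steps:

```shell
# Print the per-session limits that limits.conf governs.
# These reflect the current login session only; edits to
# /etc/security/limits.conf apply to new logins.
echo "open files (nofile): $(ulimit -n)"
echo "processes  (nproc):  $(ulimit -u)"
```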
2.Firewall settings
1) Takes effect after a reboot
Enable: chkconfig iptables on
Disable: chkconfig iptables off
2) Takes effect immediately, but is lost after a reboot
Enable: service iptables start
Disable: service iptables stop
3) vi /etc/selinux/config
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=enforcing #comment this line out
SELINUX=disabled #add this line
# SELINUXTYPE= can take one of these three values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
#SELINUXTYPE=targeted #commented out
4) Apply the SELinux change immediately (without a reboot):
setenforce 0
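As a quick check, the configured mode can be read back out of the config file. This sketch uses a temp file standing in for /etc/selinux/config so it is self-contained:

```shell
# Parse the SELINUX= setting from a config file.
# On a real node the path would be /etc/selinux/config.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
# SELINUX= can take one of these three values:
SELINUX=disabled
SELINUXTYPE=targeted
EOF
mode=$(grep '^SELINUX=' "$cfg" | cut -d= -f2)
echo "configured SELinux mode: $mode"
rm -f "$cfg"
```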
3.Set the hostname
vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=nmsc0
4.Configure /etc/hosts
vi /etc/hosts
::1 localhost6.localdomain6 localhost6
192.168.181.66 nmsc0
192.168.88.21 nmsc1
192.168.88.22 nmsc2
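The hostname-to-IP mapping can be sanity-checked with awk. The sketch below runs against a sample file so it is self-contained; on a real node it would read /etc/hosts directly:

```shell
# Look up the IP for a hostname in an /etc/hosts-style file.
hosts=$(mktemp)
cat > "$hosts" <<'EOF'
192.168.181.66 nmsc0
192.168.88.21 nmsc1
192.168.88.22 nmsc2
EOF
ip=$(awk '$2 == "nmsc1" {print $1}' "$hosts")
echo "nmsc1 -> $ip"
rm -f "$hosts"
```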
5.Download and install the JDK
#2.First remove the bundled older JDK or OpenJDK
#Check the existing Java version with: java -version
#List the matching packages: rpm -qa | grep gcj
#Then remove them: rpm -e --nodeps java-1.5.0-gcj-1.5.0.0-29.1.el6.x86_64
#3.Install jdk-7u65-linux-x64.gz
#Download jdk-7u65-linux-x64.gz to /opt/java/jdk-7u65-linux-x64.gz and unpack it
cd /opt/java/
tar -zxvf jdk-7u65-linux-x64.gz
#Configure the system environment variables
vi /etc/profile
#Append the following at the end of the file
export JAVA_HOME=/opt/java/jdk1.7.0_65
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
#Apply the changes
source /etc/profile
#4.Check that the JDK environment is configured correctly
java -version
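The effect of the profile edit can be sketched with a temp file standing in for /etc/profile (paths match the steps above; on a real node you would edit /etc/profile itself):

```shell
# Append the exports to a profile fragment, source it, and
# confirm the variables are visible in the current shell.
profile=$(mktemp)
cat >> "$profile" <<'EOF'
export JAVA_HOME=/opt/java/jdk1.7.0_65
export PATH=$PATH:$JAVA_HOME/bin
EOF
. "$profile"
echo "JAVA_HOME=$JAVA_HOME"
rm -f "$profile"
```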
6.Create a user for the Hadoop installation and configure passwordless SSH login
1) Create a user for the Hadoop installation
#Remove any existing hadoop user (and its home directory) first
userdel -r hadoop
#Create the hadoop user
useradd hadoop
#Set the hadoop user's password
passwd hadoop
2) Configure passwordless SSH login
My three machines are 192.168.181.66, 192.168.88.21 and 192.168.88.22; the commands below were run on 192.168.181.66:
su - hadoop
#Generate the public/private key pair; this must be done on every node in the cluster
ssh-keygen -t rsa
#When a client connects over ssh, the remote machine checks the client's key against its own .ssh/authorized_keys. Here /home/hadoop/.ssh/authorized_keys collects the public keys (each node's /home/hadoop/.ssh/id_rsa.pub), so running the commands below on the master node alone is enough for passwordless SSH between master and slave nodes.
cd /home/hadoop/.ssh/
#First append the master node's public key to authorized_keys
cat id_rsa.pub>>authorized_keys
#Then append each slave node's public key to authorized_keys
ssh [email protected] cat /home/hadoop/.ssh/id_rsa.pub>> authorized_keys
ssh [email protected] cat /home/hadoop/.ssh/id_rsa.pub>> authorized_keys
#Distribute the master's authorized_keys to the slave nodes
scp -r /home/hadoop/.ssh/authorized_keys [email protected]:/home/hadoop/.ssh/
scp -r /home/hadoop/.ssh/authorized_keys [email protected]:/home/hadoop/.ssh/
#The permissions on /home/hadoop/.ssh/authorized_keys must be set as follows
chmod 600 /home/hadoop/.ssh/authorized_keys
#Passwordless login to nmsc1
ssh nmsc1
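The permission requirement can be reproduced locally. This is a self-contained sketch using a temp directory in place of /home/hadoop/.ssh:

```shell
# sshd refuses key authentication if .ssh or authorized_keys
# is group/world accessible: 700 on the directory, 600 on the file.
sshdir="$(mktemp -d)/.ssh"
mkdir -p "$sshdir"
touch "$sshdir/authorized_keys"
chmod 700 "$sshdir"
chmod 600 "$sshdir/authorized_keys"
stat -c '%a %n' "$sshdir" "$sshdir/authorized_keys"
```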
3) Useful ssh service commands
#Check the ssh version
ssh -V
#Check the openssl version
openssl version -a
#Restart the sshd service
/etc/rc.d/init.d/sshd restart
#Log into the remote machine nmsc1 over ssh
ssh nmsc1 #or: ssh hadoop@nmsc1
#Show verbose debug output when logging into nmsc1
ssh -v2 nmsc1