Step 1: Install a Linux virtual machine on Windows
1. Install VMware Workstation; version 12.5.6 is used locally.
2. Download the CentOS 6.3 image, CentOS-6.3-x86_64-bin-DVD1.iso.
3. Create the virtual machine.
Step 2: Configure the network
1. Set the IP address and related settings of the local (host) network adapter.
2. Configure the virtual machine's network.
(1) Edit the ifcfg-eth0 file
Command (the standard location on CentOS 6): vi /etc/sysconfig/network-scripts/ifcfg-eth0
Change the file contents; a sketch is shown below.
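A minimal sketch of a static-IP configuration, assuming the address 192.168.0.105 that is used later in this guide for the meritdata host; the netmask, gateway, and DNS values are placeholders for your own network settings:

# /etc/sysconfig/network-scripts/ifcfg-eth0 (values below are assumptions)
DEVICE=eth0
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.0.105
NETMASK=255.255.255.0
GATEWAY=192.168.0.1
DNS1=192.168.0.1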
After the changes, restart the network: service network restart
Disable the firewall: service iptables stop (temporary)
chkconfig iptables off (permanent, takes effect from the next boot)
3. Edit the hosts file
Add an entry mapping the meritdata hostname to its IP address, as shown below.
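For example, assuming the virtual machine's IP is 192.168.0.105 (as used later in this guide), the /etc/hosts entry would look like:

192.168.0.105   meritdata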
Step 3: Install the JDK (as root)
1. Upload the prepared installation files to the /bigdata directory via FTP.
2. Extract the JDK: tar -vxf jdk-7u79-linux-x64.tar.gz
3. Create a symlink so that /bigdata/java points to the JDK: ln -s /bigdata/jdk1.7.0_79 /bigdata/java
After creating it, verify the link as shown below.
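A quick check, as a sketch (sizes and dates will differ on your machine):

ls -l /bigdata
# the output should include a line like:  java -> /bigdata/jdk1.7.0_79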
4. Configure the JDK environment variables
vi /etc/profile
Add the following lines:
export JAVA_HOME=/bigdata/java
export CLASSPATH=/bigdata/java/lib
export PATH=$JAVA_HOME/bin:$PATH
Then reload the profile: source /etc/profile
5. Check whether the JDK was installed successfully, as shown below.
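A quick verification, as a sketch; the output should report java version "1.7.0_79":

source /etc/profile
java -version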
第四步:新建用户
Su -root
adduser bigdata
passwd bigdata
cd /
chown -R bigdata:bigdata /bigdata/
Step 5: Set up passwordless SSH login (as the bigdata user)
Switch to the newly created user:
su - bigdata
Generate the key pair:
ssh-keygen -t rsa
Add the public key to this machine's authorized keys:
cd ~/.ssh
cat id_rsa.pub > authorized_keys
Restrict the file permissions:
chmod 600 authorized_keys
Test:
ssh meritdata (use your own hostname); the first login may ask for a yes confirmation, after which you can log in without a password.
Step 6: Install Hadoop (as the bigdata user)
1. Extract Hadoop and Hive
Log in as the bigdata user: su - bigdata
Go to the /bigdata directory: cd /bigdata
Extract Hadoop: tar -vxf hadoop-2.6.0-cdh5.8.2.tar.gz
Extract Hive: tar -vxf hive-1.1.0-cdh5.8.2.tar.gz
Create the hadoop symlink: ln -s hadoop-2.6.0-cdh5.8.2/ hadoop
Create the hive symlink: ln -s hive-1.1.0-cdh5.8.2/ hive
2. Configure environment variables
Edit the profile file:
vi ~/.bash_profile
Add the Java and Hadoop settings:
export JAVA_HOME=/bigdata/java
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_HOME=/bigdata/hadoop
export PATH=$HADOOP_HOME/sbin:$HADOOP_HOME/bin:$PATH
Also add the Hadoop settings to /etc/profile:
Switch to the root user:
vi /etc/profile
Add: export HADOOP_HOME=/bigdata/hadoop
export PATH=$HADOOP_HOME/bin:$PATH
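After reloading the profiles (source /etc/profile as root, source ~/.bash_profile as bigdata, or log in again), a quick sanity check, as a sketch:

hadoop version
# should report Hadoop 2.6.0-cdh5.8.2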
3. Edit the Hadoop XML configuration files
Go to the configuration directory: cd /bigdata/hadoop/etc/hadoop
(1) vi core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://meritdata:8020</value>  <!-- meritdata is the hostname -->
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/bigdata/hadoop/tmp</value>
</property>
<!-- replace "bigdata" below with whichever user runs the Hadoop/Hive services -->
<property>
<name>hadoop.proxyuser.bigdata.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.bigdata.groups</name>
<value>*</value>
</property>
</configuration>
(2) vi hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
(3) vi mapred-site.xml
The directory does not contain mapred-site.xml by default, so create it from the template first:
cp mapred-site.xml.template mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
(4) vi yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>10240</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>92160</value>
</property>
</configuration>
(5) vi hadoop-env.sh
The only change needed here should be JAVA_HOME; the rest of the file is left at the shipped defaults:
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME. All others are
# optional. When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.
export JAVA_HOME=/bigdata/jdk1.7.0_79

# The jsvc implementation to use. Jsvc is required to run secure datanodes
# that bind to privileged ports to provide authentication of data transfer
# protocol. Jsvc is not required if SASL is configured for authentication of
# data transfer protocol using non-privileged ports.
#export JSVC_HOME=${JSVC_HOME}

export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}

# Extra Java CLASSPATH elements. Automatically insert capacity-scheduler.
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
  if [ "$HADOOP_CLASSPATH" ]; then
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
  else
    export HADOOP_CLASSPATH=$f
  fi
done

# The maximum amount of heap to use, in MB. Default is 1000.
#export HADOOP_HEAPSIZE=
#export HADOOP_NAMENODE_INIT_HEAPSIZE=""

# Extra Java runtime options. Empty by default.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_NFS3_OPTS="$HADOOP_NFS3_OPTS"
export HADOOP_PORTMAP_OPTS="-Xmx512m $HADOOP_PORTMAP_OPTS"

# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
#HADOOP_JAVA_PLATFORM_OPTS="-XX:-UsePerfData $HADOOP_JAVA_PLATFORM_OPTS"

# On secure datanodes, user to run the datanode as after dropping privileges.
# This **MUST** be uncommented to enable secure HDFS if using privileged ports
# to provide authentication of data transfer protocol. This **MUST NOT** be
# defined if SASL is configured for authentication of data transfer protocol
# using non-privileged ports.
export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER}

# Where log files are stored. $HADOOP_HOME/logs by default.
#export HADOOP_LOG_DIR=${HADOOP_LOG_DIR}/$USER

# Where log files are stored in the secure data environment.
export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER}

###
# HDFS Mover specific parameters
###
# Specify the JVM options to be used when starting the HDFS Mover.
# These options will be appended to the options specified as HADOOP_OPTS
# and therefore may override any similar flags set in HADOOP_OPTS
#
# export HADOOP_MOVER_OPTS=""

###
# Advanced Users Only!
###

# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by
#       the user that will run the hadoop daemons. Otherwise there is the
#       potential for a symlink attack.
export HADOOP_PID_DIR=${HADOOP_PID_DIR}
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}

# A string representing this instance of hadoop. $USER by default.
export HADOOP_IDENT_STRING=$USER
4. Format the NameNode
/bigdata/hadoop/bin/hdfs namenode -format
5. Start Hadoop
sbin/start-all.sh (run from /bigdata/hadoop)
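To confirm the daemons came up, check with jps (process IDs will differ). On this single-node setup you should see NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager:

jps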
6. Create a directory in HDFS
Create a directory: hadoop fs -mkdir /user
Check that it was created: hadoop fs -ls /
Monitor Hadoop through the YARN web UI:
http://192.168.0.105:8088/cluster
Step 7: Install MySQL (as root)
1. Check whether MySQL is already installed
Case-insensitive search: rpm -qa | grep -i mysql
Remove any existing packages, e.g.: rpm -e MySQL-devel-5.5.20-1.el6.x86_64 --nodeps
2. Extract the MySQL bundle
tar -vxf MySQL-5.5.20-1.el6.x86_64.tar
3. Install MySQL from the RPM packages
rpm -ivh MySQL-client-5.5.20-1.el6.x86_64.rpm
rpm -ivh MySQL-devel-5.5.20-1.el6.x86_64.rpm
rpm -ivh MySQL-embedded-5.5.20-1.el6.x86_64.rpm
rpm -ivh MySQL-server-5.5.20-1.el6.x86_64.rpm
rpm -ivh MySQL-shared-5.5.20-1.el6.x86_64.rpm
rpm -ivh MySQL-test-5.5.20-1.el6.x86_64.rpm
4. Copy the sample configuration: cp /usr/share/mysql/my-medium.cnf /etc/my.cnf
5. vi /etc/my.cnf
[client]
default-character-set=utf8
[mysqld]
character-set-server=utf8
lower_case_table_names=1
max_connections=1000
max_allowed_packet = 100M
[mysql]
default-character-set=utf8
Note: the character-set, lower_case_table_names, max_connections, and max_allowed_packet settings above are the lines added to the default file.
6. Place the mysql-connector-java-5.1.41-bin.jar driver in Hive's lib directory (/bigdata/hive/lib).
7. Start the MySQL service
service mysql start
8. Set a password for the MySQL root user
Set the password: mysqladmin -uroot password 'root123'
Log in to MySQL: mysql -uroot -p
Check the character-set settings: show variables like 'character%';
9. Create the hive database and user
Create the hive database: create database hive default character set latin1;
Create the hive user: create user hive identified by 'hive';
Grant privileges to the hive user: grant all privileges on hive.* to 'hive'@'meritdata' identified by 'hive';
grant all privileges on hive.* to 'hive'@'%' identified by 'hive';
Apply the changes: flush privileges;
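A quick check that the new account and grants work, as a sketch using the user and password created above:

mysql -uhive -phive -h meritdata
show databases;    -- the hive database should appear in the list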
Step 8: Configure Hive
1. Configure the hive-site.xml file
cd /bigdata/hive/conf
vi hive-site.xml
The file contents are as follows:
<configuration>
<!--metastore-->
<property>
<name>datanucleus.autoCreateTables</name>
<value>true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://meritdata:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://meritdata:9083</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/hive</value>
</property>
<!--hiveserver2-->
<property>
<name>hive.server2.thrift.bind.host</name>
<value>meritdata</value>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
</configuration>
2. Configure environment variables
vi ~/.bash_profile
Add the following lines:
export HIVE_HOME=/bigdata/hive
export PATH=$HIVE_HOME/bin:$PATH
export HIVE_CONF=$HIVE_HOME/conf
export HCAT_HOME=$HIVE_HOME/hcatalog
source ~/.bash_profile
3. Start Hive
Create the logs directory: mkdir /bigdata/hive/logs
Start the Hive services:
nohup hive --service metastore > /bigdata/hive/logs/metastore.log 2>&1 &
nohup hive --service hiveserver2 > /bigdata/hive/logs/hiveserver2.log 2>&1 &
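To confirm both services are listening, a quick check as a sketch (9083 is the metastore port and 10000 the HiveServer2 port configured in hive-site.xml):

netstat -ntlp | grep -E '9083|10000'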
4. Run a Hive test
Day-to-day SQL operations in Hive are largely the same as in MySQL; see the smoke test below.
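A minimal smoke test, as a sketch; test_tb is just an example table name:

hive
show databases;
create table test_tb (id int, name string);
show tables;
drop table test_tb;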