Sqoop2 is a tool for transferring data between relational databases and HDFS.
Download the Sqoop2 package sqoop-1.99.7-bin-hadoop200.tar.gz
Download URL: http://mirrors.hust.edu.cn/apache/sqoop/
1. Extract the package
tar -xvf sqoop-1.99.7-bin-hadoop200.tar.gz
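The environment variables set later in this guide point at /usr/local/sqoop-1.99.7, while the archive presumably unpacks into a directory named after the tarball. A minimal sketch of extracting under a prefix and renaming the result, assuming the archive's top-level directory is sqoop-1.99.7-bin-hadoop200 (the function name extract_sqoop is made up for illustration):

```shell
# Hypothetical helper: unpack the Sqoop2 tarball under a prefix and rename the
# extracted directory to the shorter name used for SQOOP_HOME later on.
# Assumes the archive's top-level directory matches the tarball name.
extract_sqoop() {
    tarball=$1          # e.g. sqoop-1.99.7-bin-hadoop200.tar.gz
    prefix=$2           # e.g. /usr/local
    name=$(basename "$tarball" .tar.gz)
    tar -xzf "$tarball" -C "$prefix" || return 1
    mv "$prefix/$name" "$prefix/sqoop-1.99.7"
}

# Example (may need root privileges for /usr/local):
# extract_sqoop sqoop-1.99.7-bin-hadoop200.tar.gz /usr/local
```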
2. Hadoop configuration
Sqoop2 requires the HADOOP_HOME environment variable. By default it looks for JAR files in the following directories:
$HADOOP_HOME/share/hadoop/common, $HADOOP_HOME/share/hadoop/hdfs, $HADOOP_HOME/share/hadoop/mapreduce and $HADOOP_HOME/share/hadoop/yarn
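A quick pre-flight check can catch a wrong HADOOP_HOME before Sqoop2 fails to find its Hadoop libraries. The sketch below only tests that the four directories exist; the function name check_hadoop_dirs is hypothetical and not part of Sqoop2 or Hadoop:

```shell
# Hypothetical pre-flight check: confirm the four library directories exist
# under HADOOP_HOME (or under the directory passed as the first argument).
# Prints each missing directory and returns non-zero if any is absent.
check_hadoop_dirs() {
    home=${1:-${HADOOP_HOME:-}}
    if [ -z "$home" ]; then
        echo "HADOOP_HOME is not set" >&2
        return 1
    fi
    missing=0
    for sub in common hdfs mapreduce yarn; do
        dir="$home/share/hadoop/$sub"
        if [ ! -d "$dir" ]; then
            echo "missing: $dir" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example:
# check_hadoop_dirs && echo "Hadoop libraries found"
```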
Edit Hadoop's core-site.xml to grant proxy-user (impersonation) permissions:
<property>
<name>hadoop.proxyuser.hadoop1.groups</name>
<value>*</value>
<description>Allow the user hadoop1 to impersonate members of any group</description>
</property>
<property>
<name>hadoop.proxyuser.hadoop1.hosts</name>
<value>*</value>
<description>Allow the user hadoop1 to impersonate other users when connecting from any host</description>
</property>
In hadoop.proxyuser.hadoop1.groups and hadoop.proxyuser.hadoop1.hosts, hadoop1 is the current user name (the user that will run the Sqoop2 server); replace it with your own user name.
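Since the property names embed the user name, generating them avoids copy-and-paste slips. A minimal sketch that prints both properties for a given user; the helper name proxyuser_properties is hypothetical, and the wildcard values simply mirror the configuration above:

```shell
# Hypothetical helper: print the two proxy-user properties for a user.
# Defaults to the current user when no argument is given.
# The "*" values mirror the guide; tighten them for production use.
proxyuser_properties() {
    user=${1:-$(id -un)}
    for key in groups hosts; do
        printf '<property>\n'
        printf '  <name>hadoop.proxyuser.%s.%s</name>\n' "$user" "$key"
        printf '  <value>*</value>\n'
        printf '</property>\n'
    done
}

# Example: generate the snippet for the current user
# proxyuser_properties
```

Hadoop has to pick up the change before Sqoop2 can impersonate users. Restarting the Hadoop services does this; `hdfs dfsadmin -refreshSuperUserGroupsConfiguration` and `yarn rmadmin -refreshSuperUserGroupsConfiguration` should reload the proxy-user settings without a restart.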
3. Set environment variables
export SQOOP_HOME=/usr/local/sqoop-1.99.7
export LOGDIR=$SQOOP_HOME/logs
export PATH=.:$SQOOP_HOME/bin:$PATH
export SQOOP_SERVER_EXTRA_LIB=/usr/local/sqoop-1.99.7/lib
SQOOP_SERVER_EXTRA_LIB is the directory that holds third-party JAR files, for example mysql-connector-java-5.1.40-bin.jar and sqljdbc41.jar.
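Placing a driver in that directory is a plain file copy. A small sketch, assuming the driver JAR has already been downloaded; install_jdbc_driver is a hypothetical helper name, not a Sqoop2 command:

```shell
# Hypothetical helper: copy a JDBC driver JAR into the extra-lib directory.
# The target defaults to SQOOP_SERVER_EXTRA_LIB and is created if missing.
install_jdbc_driver() {
    jar=$1
    libdir=${2:-${SQOOP_SERVER_EXTRA_LIB:-}}
    if [ -z "$libdir" ]; then
        echo "SQOOP_SERVER_EXTRA_LIB is not set" >&2
        return 1
    fi
    if [ ! -f "$jar" ]; then
        echo "no such file: $jar" >&2
        return 1
    fi
    mkdir -p "$libdir" && cp "$jar" "$libdir/"
}

# Example:
# install_jdbc_driver ~/Downloads/mysql-connector-java-5.1.40-bin.jar
```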
4. Edit the configuration file $SQOOP_HOME/conf/sqoop.properties
# The following tokens are used in this configuration file:
#
LOGDIR=/usr/local/sqoop-1.99.7/logs
# The absolute path to the directory where system generated
# log files will be kept.
#
BASEDIR=/usr/local/sqoop-1.99.7/baseDir
# The absolute path to the directory where Sqoop 2 is installed
#
# Hadoop configuration directory
org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/usr/hadoop-2.7.3/etc/hadoop/
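A wrong Hadoop configuration directory is an easy mistake to make here, so it can help to read the value back and confirm that the directory really contains core-site.xml. The sketch below does only that; both function names are hypothetical:

```shell
# Hypothetical helper: print the Hadoop configuration directory that is
# set in the given sqoop.properties file (last assignment wins).
hadoop_conf_dir() {
    props=$1
    key='org.apache.sqoop.submission.engine.mapreduce.configuration.directory'
    # Match only uncommented assignments; keep everything after the first '='.
    grep "^$key=" "$props" | tail -n 1 | cut -d= -f2-
}

# Hypothetical check: succeed only if that directory contains core-site.xml.
check_hadoop_conf_dir() {
    dir=$(hadoop_conf_dir "$1")
    [ -n "$dir" ] && [ -f "$dir/core-site.xml" ]
}

# Example:
# check_hadoop_conf_dir "$SQOOP_HOME/conf/sqoop.properties" || echo "check the Hadoop config path"
```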
5. Initialize the Sqoop2 service
The Sqoop2 repository must be initialized before the server is started for the first time:
sqoop2-tool upgrade
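Use the verify tool to check whether the Sqoop2 configuration is correct. If verification fails, run the initialization again and then repeat the check: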
sqoop2-tool verify
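The two commands can be chained so that verification only runs after a successful initialization. This sketch assumes sqoop2-tool reports failure through a non-zero exit status, which is an assumption here; SQOOP2_TOOL is a hypothetical override variable used only to allow a dry run:

```shell
# Hypothetical wrapper: run "upgrade" and then "verify", stopping at the
# first failure. Assumes the tool exits non-zero when a step fails.
# SQOOP2_TOOL is an illustrative override; by default sqoop2-tool is used.
init_and_verify() {
    tool=${SQOOP2_TOOL:-sqoop2-tool}
    "$tool" upgrade || { echo "upgrade failed" >&2; return 1; }
    "$tool" verify  || { echo "verify failed" >&2; return 1; }
    echo "Sqoop2 repository initialized and verified"
}

# Example:
# init_and_verify && sqoop2-server start
```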
6. Start and stop the Sqoop2 service
sqoop2-server start
sqoop2-server stop
7. Start the Sqoop2 client shell
sqoop2-shell
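Once the shell is available, a first connectivity check is to point it at the server and ask for version information. The sketch below writes those two shell commands to a script file that sqoop2-shell can run in batch mode. It assumes the server runs on localhost with the default port 12000 and webapp name sqoop; write_sqoop_script is a hypothetical helper name:

```shell
# Hypothetical helper: write a small Sqoop2 shell script to the given path.
# Host, port and webapp are assumptions (local server with default settings).
write_sqoop_script() {
    cat > "$1" <<'EOF'
set server --host localhost --port 12000 --webapp sqoop
show version --all
EOF
}

# Example:
# write_sqoop_script /tmp/check.sqoop
# sqoop2-shell /tmp/check.sqoop
```

The same two commands can also be typed interactively at the sqoop2-shell prompt.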