Sqoop
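Sqoop Overview
Sqoop is a tool for transferring data between Hadoop and relational databases.
Import: from a relational database into the Hadoop platform
Export: from the Hadoop platform into a relational database
How it works: Sqoop translates import and export commands into MapReduce jobs that run in parallel
Use case: commonly used to load business data into a data warehouse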
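To make the "runs in parallel" point concrete, here is a rough sketch of how an import over an integer column could be divided among mappers. It is a simplified model, not Sqoop's actual splitter: the function name `split_ranges` and the bounds are made up for illustration, and the real boundary handling differs in detail.

```shell
#!/usr/bin/env bash
# Simplified model: divide the value range of an integer split column
# (here assumed to be order_id) into one contiguous range per mapper.
split_ranges() {
  local min=$1 max=$2 mappers=$3
  local span=$(( max - min + 1 ))
  local size=$(( (span + mappers - 1) / mappers ))   # ceiling division
  local lo=$min hi
  while [ "$lo" -le "$max" ]; do
    hi=$(( lo + size - 1 ))
    if [ "$hi" -gt "$max" ]; then hi=$max; fi
    echo "mapper range: order_id >= $lo AND order_id <= $hi"
    lo=$(( hi + 1 ))
  done
}

# Hypothetical bounds: ids 1..300 spread over 3 mappers.
split_ranges 1 300 3
```

Each printed range stands for the slice of rows one map task would read, which is why a reasonably evenly distributed split column matters.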
Sqoop Commands
Import: sqoop import
- Basic import
  - sqoop import --connect jdbc:mysql://hadoop102:3306/retail_db --driver com.mysql.jdbc.Driver --table customers --username root --password root --target-dir /data/retail_db/customers -m 3
- Filtering rows with --where
  - sqoop import --connect jdbc:mysql://hadoop102:3306/retail_db --table orders --where "order_id < 500" --username root --password root --delete-target-dir --target-dir /data/orders -m 3
- Selecting columns with --columns
  - sqoop import --connect jdbc:mysql://hadoop102:3306/retail_db --table orders --columns "order_id,order_date,order_customer_id" --username root --password root --delete-target-dir --target-dir /data/orders -m 3
- Import with --query (the query must contain $CONDITIONS; escape it as \$CONDITIONS inside double quotes, and split on a numeric column)
  - sqoop import --connect jdbc:mysql://hadoop102:3306/retail_db --query "select * from orders where order_status != 'CLOSED' and \$CONDITIONS" --username root --password root --split-by order_id --delete-target-dir --target-dir /data/ -m 3
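When the --query string is wrapped in double quotes, the shell expands $CONDITIONS before sqoop ever sees it, unless the dollar sign is escaped. The following sketch only uses echo to show the difference; the variable names `unescaped` and `escaped` are illustrative.

```shell
#!/usr/bin/env bash
# CONDITIONS is normally not a shell variable, so an unescaped $CONDITIONS
# inside double quotes expands to an empty string.
unset CONDITIONS

unescaped="select * from orders where order_status != 'CLOSED' and $CONDITIONS"
escaped="select * from orders where order_status != 'CLOSED' and \$CONDITIONS"

# The first line ends in "and " with the placeholder lost;
# the second keeps the literal $CONDITIONS token that sqoop needs.
echo "unescaped: $unescaped"
echo "escaped:   $escaped"
```

Wrapping the whole query in single quotes would also keep the token literal, but this query already uses single quotes around 'CLOSED', so escaping is the simpler choice here.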
- Incremental import
  - sqoop import --connect jdbc:mysql://hadoop102:3306/retail_db --table orders --username root --password root --incremental append --check-column order_date --last-value '2014-04-15' --target-dir /data/orders -m 3
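For repeated incremental runs, the --last-value argument has to move forward each time. One possible approach, sketched below, keeps the value in a local state file and builds the next command from it. The helper `build_incremental_cmd` and the file `orders.last_value` are hypothetical names, and the script only prints the command instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read the last imported value from a state file and
# print (not run) the next incremental import command.
STATE_FILE="${STATE_FILE:-./orders.last_value}"   # assumed location

build_incremental_cmd() {
  local last_value
  if [ -f "$STATE_FILE" ]; then
    last_value=$(cat "$STATE_FILE")
  else
    last_value='2014-04-15'   # starting point used in the incremental example
  fi
  printf '%s\n' "sqoop import --connect jdbc:mysql://hadoop102:3306/retail_db --table orders --username root --password root --incremental append --check-column order_date --last-value '$last_value' --target-dir /data/orders -m 3"
}

build_incremental_cmd
```

After a successful run, the new upper bound would be written back to the state file. Sqoop's saved jobs (sqoop job --create) can track the last value automatically, which may be preferable to a hand-rolled file.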
- Import into Hive
  - Regular table
    - sqoop import --connect jdbc:mysql://hadoop102:3306/retail_db --table orders --username root --password root --hive-import --create-hive-table --hive-database retail_db --hive-table orders -m 3
  - Partitioned table
    - sqoop import --connect jdbc:mysql://hadoop102:3306/retail_db --driver com.mysql.jdbc.Driver --query "select order_id, order_status from orders where order_date >= '2013-11-03' and order_date < '2013-11-04' and \$CONDITIONS" --username root --password root --target-dir /data/retail_db/orders --delete-target-dir --split-by order_status --hive-import --hive-table retail_db.orders --hive-partition-key "order_date" --hive-partition-value '2013-11-03' -m 1
- Import into HBase
  - sqoop import --connect jdbc:mysql://hadoop102:3306/retail_db --username root --password root --driver com.mysql.jdbc.Driver --table products --columns "product_id,product_name,product_description,product_price,product_image" --hbase-table products --column-family data --hbase-row-key product_id -m 3
Export: sqoop export
- sqoop export --connect jdbc:mysql://hadoop102:3306/sqoop --username root --password root --table customers_demo --export-dir /customerinput -m 1
Running from an options file: sqoop --options-file
- Steps
  - Create the file
    - touch job_RDBMS2HDFS.opt
  - Edit the options file (each option and each value goes on its own line)
    - import
      --connect
      jdbc:mysql://hadoop102:3306/retail_db
      --driver
      com.mysql.jdbc.Driver
      --table
      customers
      --username
      root
      --password
      root
      --target-dir
      /data/retail_db/customers
      -m
      3
  - Run the options file
    - sqoop --options-file job_RDBMS2HDFS.opt
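The create, edit, and run steps can also be combined into one small script. This is a minimal sketch, assuming the options file is written to the current directory; the `OPT_FILE` variable and the sanity check are additions for illustration, and the final line only prints the command rather than invoking sqoop.

```shell
#!/usr/bin/env bash
# Write the options file with one token per line, then show how to run it.
OPT_FILE="${OPT_FILE:-job_RDBMS2HDFS.opt}"

cat > "$OPT_FILE" <<'EOF'
# Import the customers table from MySQL into HDFS
import
--connect
jdbc:mysql://hadoop102:3306/retail_db
--driver
com.mysql.jdbc.Driver
--table
customers
--username
root
--password
root
--target-dir
/data/retail_db/customers
-m
3
EOF

# Rough sanity check: none of the values in this file contains a space,
# so a space on a non-comment line means two tokens share one line.
if grep -v '^#' "$OPT_FILE" | grep -q ' '; then
  echo "warning: found a line with more than one token" >&2
fi

echo "run with: sqoop --options-file $OPT_FILE"
```

Lines starting with # are treated as comments in an options file, so the description at the top does not affect the command.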