Sqoop Key Points Summary

Sqoop

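Introduction to Sqoop

Sqoop is a tool for transferring data between Hadoop and relational databases.

Import: from a relational database into the Hadoop platform

Export: from the Hadoop platform into a relational database

How it works: Sqoop translates each import or export command into a MapReduce job (map tasks only) that runs in parallel

Typical use: loading business data into a data warehouse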
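
To make the "runs in parallel" point concrete, here is a rough sketch of how the value range of an integer split column could be divided among mappers. It is an illustration under assumptions, not Sqoop's actual splitter code: the function name `split_ranges` and the bounds 1 to 10 are hypothetical. Sqoop itself obtains the bounds by querying the MIN and MAX of the split column.

```shell
# Illustrative only: divide the closed range [min, max] of an integer
# split column into at most <mappers> WHERE clauses, one per map task.
# Variables are plain globals to keep the sketch POSIX sh compatible.
split_ranges() {
  col=$1; min=$2; max=$3; mappers=$4
  total=$(( max - min + 1 ))
  size=$(( (total + mappers - 1) / mappers ))   # ceiling division
  lo=$min
  while [ "$lo" -le "$max" ]; do
    hi=$(( lo + size - 1 ))
    if [ "$hi" -gt "$max" ]; then
      hi=$max
    fi
    echo "$col >= $lo AND $col <= $hi"
    lo=$(( hi + 1 ))
  done
}

# Hypothetical bounds: order_id runs from 1 to 10, with 3 mappers.
split_ranges order_id 1 10 3
```

Each printed condition stands for the slice of rows one map task would read, which is why the choice of split column and the mapper count passed with -m affect how evenly the work is spread.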

Sqoop Commands

Import: sqoop import

  • Basic import

    • sqoop import --connect jdbc:mysql://hadoop102:3306/retail_db --driver com.mysql.jdbc.Driver --table customers --username root --password root --target-dir /data/retail_db/customers -m 3
  • Filtering rows with --where

    • sqoop import --connect jdbc:mysql://hadoop102:3306/retail_db --table orders --where "order_id < 500" --username root --password root --delete-target-dir --target-dir /data/orders -m 3
  • Selecting columns with --columns

    • sqoop import --connect jdbc:mysql://hadoop102:3306/retail_db --table orders --columns "order_id,order_date,order_customer_id" --username root --password root --delete-target-dir --target-dir /data/orders -m 3
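  • Import with a free-form query (--query)

    • sqoop import --connect jdbc:mysql://hadoop102:3306/retail_db --query "select * from orders where order_status != 'CLOSED' and \$CONDITIONS" --username root --password root --split-by order_status --delete-target-dir --target-dir /data/ -m 3
    • Note: the query must contain the literal token $CONDITIONS; inside double quotes it has to be written as \$CONDITIONS so the shell does not expand it.
    • Note: with --delete-target-dir, the directory given as --target-dir (here /data/) is deleted before the import runs.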
  • Incremental import

    • sqoop import --connect jdbc:mysql://hadoop102:3306/retail_db --table orders --username root --password root --incremental append --check-column order_date --last-value '2014-04-15' --target-dir /data/orders -m 3
  • Import into Hive

    • Regular table

      • sqoop import --connect jdbc:mysql://hadoop102:3306/retail_db --table orders --username root --password root --hive-import --create-hive-table --hive-database retail_db --hive-table orders -m 3
    • Partitioned table

      • sqoop import --connect jdbc:mysql://hadoop102:3306/retail_db --driver com.mysql.jdbc.Driver --query "select order_id, order_status from orders where order_date >= '2013-11-03' and order_date < '2013-11-04' and \$CONDITIONS" --username root --password root --target-dir /data/retail_db/orders --delete-target-dir --split-by order_status --hive-import --hive-table retail_db.orders --hive-partition-key "order_date" --hive-partition-value '2013-11-03' -m 1
  • Import into HBase

    • sqoop import --connect jdbc:mysql://hadoop102:3306/retail_db --username root --password root --driver com.mysql.jdbc.Driver --table products --columns "product_id,product_name,product_description,product_price,product_image" --hbase-table products --column-family data --hbase-row-key product_id -m 3
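
The --query imports above only work if the literal token $CONDITIONS reaches Sqoop, which then substitutes each map task's split condition for it. The snippet below demonstrates only the shell quoting involved and does not run Sqoop. The variable names are chosen for illustration.

```shell
# Simulate a shell in which CONDITIONS is empty (the usual case).
CONDITIONS=""

# Double quotes, no escaping: the shell expands $CONDITIONS to nothing,
# so the placeholder is lost before Sqoop ever sees the query.
broken="select * from orders where $CONDITIONS"

# Double quotes with a backslash: the token survives literally.
escaped="select * from orders where \$CONDITIONS"

# Single quotes: nothing is expanded, so no backslash is needed.
single='select * from orders where $CONDITIONS'

printf '%s\n' "broken : $broken" "escaped: $escaped" "single : $single"
```

Single quotes around the whole query are the simpler choice when the SQL itself contains no single-quoted literals. Otherwise, double quotes with \$CONDITIONS, as in the commands above, keep the SQL string literals intact.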

Export: sqoop export

  • sqoop export \
    --connect jdbc:mysql://hadoop102:3306/sqoop \
    --username root \
    --password root \
    --table customers_demo \
    --export-dir /customerinput \
    -m 1
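
Export parses each line of the files under --export-dir into the columns of the target table, so the field delimiter has to match how those files were written. The following is a minimal dry-run sketch that prints the command instead of executing it. `print_export_cmd` is a hypothetical helper, and the comma delimiter is an assumption, since the source does not say how the files under /customerinput are formatted.

```shell
# Dry run: print a sqoop export command instead of executing it.
# print_export_cmd is a hypothetical helper, not part of Sqoop.
print_export_cmd() {
  # $1 = target table, $2 = HDFS export directory, $3 = field delimiter
  printf '%s ' sqoop export \
    --connect jdbc:mysql://hadoop102:3306/sqoop \
    --username root \
    --password root \
    --table "$1" \
    --export-dir "$2" \
    --input-fields-terminated-by "'$3'" \
    -m 1
  printf '\n'
}

# Assumption: the input files are comma-delimited.
print_export_cmd customers_demo /customerinput ','
```

The printed line can be reviewed and then pasted into a shell on a host where Sqoop is installed.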

Running from an options file: sqoop --options-file

  • Steps

    • Create the file

      • touch job_RDBMS2HDFS.opt
    • Edit the file (each option and each value goes on its own line)

      • import
        --connect
        jdbc:mysql://hadoop102:3306/retail_db
        --driver
        com.mysql.jdbc.Driver
        --table
        customers
        --username
        root
        --password
        root
        --target-dir
        /data/retail_db/customers
        -m
        3
    • Run the file

      • sqoop --options-file job_RDBMS2HDFS.opt
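
The three steps above can be sketched as one script. It writes the options file with a here-document and then checks that no line holds more than one token, since Sqoop reads each line of an options file as a single argument. `check_options_file` is a hypothetical helper, and the temporary directory is used only so that the sketch does not touch the working directory.

```shell
# Write the options file into a temporary directory.
tmp_dir=$(mktemp -d)
opt_file="$tmp_dir/job_RDBMS2HDFS.opt"

cat > "$opt_file" <<'EOF'
import
--connect
jdbc:mysql://hadoop102:3306/retail_db
--driver
com.mysql.jdbc.Driver
--table
customers
--username
root
--password
root
--target-dir
/data/retail_db/customers
-m
3
EOF

# Hypothetical helper: succeeds only if no non-comment line contains
# whitespace, i.e. every option and value sits on its own line.
# It is deliberately strict and would also reject quoted values with spaces.
check_options_file() {
  ! grep -v '^#' "$1" | grep -q '[[:space:]]'
}

if check_options_file "$opt_file"; then
  # Printed rather than executed, so the sketch runs without Sqoop.
  echo "sqoop --options-file $opt_file"
else
  echo "options file has more than one token on a line" >&2
fi
```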


Reposted from blog.csdn.net/m0_48758256/article/details/109065126