Oozie Sqoop Action 配置
- Sqoop Action 用来运行sqoop 任务,流程任务必须等当前节点的sqoop任务执行完成之后才能执行后续节点任务。
- Email Action 所有的节点值都可以使用EL表达式
- 运行Sqoop Job,必须在 sqoop action里面配置 job-tracer,name-node,Sqoop command,也许还需要一些参数和配置。
- 同Shell Action一样 Sqoop Action 可以配置成创建或者删除HDFS目录之后再去执行一个Sqoop任务
- Sqoop 应用的配置可以使用job-xml文件中的元素,也可以使用内部元素来配置,像EL表达式也支持在内部元素中的配置,内部元素的配置可以覆盖外部文件中的配置,内部元素配置不能使用 Hadoop mapred.job.tracker and fs.default.name这两个属性
- 跟mr任务一样,在Shell任务中也可以使用文件和附件具体参见【http://archive.cloudera.com/cdh/3/oozie/WorkflowFunctionalSpec.html#a3.2.2.1_Adding_Files_and_Archives_for_the_Job】
Sqoop Action格式
- prepare 元素 如果存在,表明在执行sqoop 命令之前需要执行的一系列 hdfs路径的创建和删除操作,并且路径必须以 hdfs://HOST:PORT 开头
- job-xml 元素 如果存在,则作为sqoop任务的配置文件,从 schema 0.3开始支持多个job-xml元素用来支持多个job.xml文件
- configuration 用来给sqoop任务传递参数
sqoop command
- sqoop 命令可以通过command元素或者多个arg元素指定
- 当使用command的时候,oozie会根据空格把命令切分成多个参数
- 当使用arg的时候,oozie将会把arg里面的值当成参数传递给sqoop
- 当一个参数里面有空格的时候,必须用arg来指定
- 上述所有的元素值都可以使用EL表达式配置
Sqoop Action 使用实例一:使用sqoop同步mysql数据,执行成功发送提示邮件
1,新建 job.properties
2,workflow.xml
3,首先在本地的测试节点上创建文件夹
mkdir -p /opt/mydata/user/oozie/xwj_test/apps/shell/
sqoop_email
4,在hdfs上创建目录 hdfs dfs -mkdir -p /user/oozie/xwj_test/apps/shell/
sqoop_email
5,将上述文件上传到新建好的目录中
cd /opt/mydata/user/oozie/xwj_test/apps/shell/
sqoop_email
6,将本地文件 上传到hdfs目录中
hdfs dfs -put ../
sqoop_email/* /user/oozie/xwj_test/apps/shell/
sqoop_email
7,查看hdfs上的目录文件是否存在
hdfs dfs -ls -r /user/oozie/xwj_test/apps/shell/
sqoop_email
8,切换yarn用户重新提交任务
su yarn
oozie job -oozie http://hadoop-node0.novalocal:11000/oozie -config /opt/mydata/user/oozie/xwj_test/apps/shell/
sqoop_email/job.properties -run
执行结果报错
ACTION[0000002-180412152846094-oozie-root-W@sqoop-node] Launcher exception: java.lang.ClassNotFoundException: Class org.apache.oozie.action.hadoop.SqoopMain not found
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.oozie.action.hadoop.SqoopMain not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2241)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:238)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
Caused by: java.lang.ClassNotFoundException: Class org.apache.oozie.action.hadoop.SqoopMain not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2147)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2239)
... 9 more
参考链接
修改job.properties 加入
oozie.use.system.libpath=true
重新运行
结果报错
2018-04-13 09:19:00,121 WARN SqoopActionExecutor:523 - SERVER[hadoop-node0.novalocal] USER[root] GROUP[-] TOKEN[] APP[email-wf] JOB[0000010-180412152846094-oozie-root-W] ACTION[0000010-180412152846094-oozie-root-W@sqoop-node] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]
9 ,遇到这个问题直接百度有很多人遇到相关问题,但是解决的办法很少,这里我们记录一下排查过程
9.1 首先根据 oozie启动的任务ID 到oozie界面上找到 该任务的错误详情
9.2 点击错误的节点 查看节点执行的详情日志
9.3 最终层层定位 ,终于找到错误的真正日志
java.sql.SQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'id,app_name,app_path,user_name from WF_JOBS where (1 = 0)' at line 1 at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:536) at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:513) at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:115) at com.mysql.cj.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:1983) at com.mysql.cj.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1826) at com.mysql.cj.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1923) at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:777) at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:786) at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:289) at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:260) at org.apache.sqoop.manager.SqlManager.getColumnTypesForQuery(SqlManager.java:253) at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:337) at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1853) at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1653) at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107) at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:488) at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615) at org.apache.sqoop.Sqoop.run(Sqoop.java:147) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234) at org.apache.sqoop.Sqoop.main(Sqoop.java:243) at org.apache.oozie.action.hadoop.SqoopMain.runSqoopJob(SqoopMain.java:197) at org.apache.oozie.action.hadoop.SqoopMain.run(SqoopMain.java:179) at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:58) at org.apache.oozie.action.hadoop.SqoopMain.main(SqoopMain.java:48) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:240) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
9.4 通过日志信息发现是 sqoop 导入mysql的 sql 格式有问题 没有添加 select ,修正之后 重新提交 终于运行成功 (
这个地方虽然是一个比较粗心的错误 ,但是通过这个错误找到排查具体日志的方法,非常重要,对于研发来说里程牌式的意义)