Custom Flume sink: SQL Server (Part 1), environment setup on Windows

After some trial and error, I finally got Flume on Windows to extract data from txt files into SQL Server. The development process is shared below.

Environment setup on Windows

 (1) jre-8u171-windows-x64.exe

 (2) apache-flume-1.7.0-bin

 (3) Write the Flume configuration file client.properties

a1.channels = c1
a1.sources = r1

a1.sinks = k1

a1.sources.r1.type = spooldir
a1.sources.r1.channels = c1
# Directory monitored by Flume
a1.sources.r1.spoolDir = D:/flumetest
a1.sources.r1.fileHeader = true
a1.sources.r1.fileHeaderKey = file
a1.sources.r1.deletePolicy = immediate
a1.sources.r1.recursiveDirectorySearch = true
a1.sources.r1.inputCharset = UTF-8
a1.sources.r1.batchSize = 100
a1.sources.r1.decodeErrorPolicy = IGNORE

a1.sources.r1.deserializer = LINE

a1.channels.c1.type = file
# Checkpoint directory; it must not be inside the directory monitored by Flume
a1.channels.c1.checkpointDir = D:/flumedata/checkpoint
# Data directory; it must not be inside the directory monitored by Flume
a1.channels.c1.dataDirs = D:/flumedata/data
a1.channels.c1.keep-alive = 1
# transactionCapacity must be <= capacity
a1.channels.c1.transactionCapacity = 500
a1.channels.c1.capacity = 500

a1.sinks.k1.type = avro
a1.sinks.k1.hostname = 10.96.183.54
a1.sinks.k1.port = 12345

a1.sinks.k1.channel = c1
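
The constraints mentioned in this configuration (transactionCapacity no larger than capacity, and the channel directories kept outside the monitored directory) can be checked before the agent is started. The following is only a minimal sketch in Python, not part of Flume: the function names are hypothetical, the parser handles plain `key = value` lines only, and the batchSize check reflects the common recommendation that a source batch should fit into one channel transaction. It also flags a `#` that follows a value, because a .properties file treats `#` as a comment only at the start of a line.

```python
from pathlib import PureWindowsPath


def parse_properties(text):
    """Parse simple 'key = value' lines; lines starting with # or ! are comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def as_int(props, key):
    """Return the integer value of key, or None if it is missing or not a number."""
    try:
        return int(props[key])
    except (KeyError, ValueError):
        return None


def is_inside(child, parent):
    """True if the Windows path child equals parent or lies beneath it."""
    c, p = PureWindowsPath(child), PureWindowsPath(parent)
    return c == p or p in c.parents


def check_agent(props, agent="a1", source="r1", channel="c1"):
    """Return a list of problems found in the source and channel settings."""
    problems = []
    src = f"{agent}.sources.{source}."
    ch = f"{agent}.channels.{channel}."

    # A '#' after a value is not a comment in a .properties file.
    for key, value in props.items():
        if "#" in value:
            problems.append(f"{key}: '#' is read as part of the value")

    capacity = as_int(props, ch + "capacity")
    tx = as_int(props, ch + "transactionCapacity")
    batch = as_int(props, src + "batchSize")
    if None not in (capacity, tx) and tx > capacity:
        problems.append("transactionCapacity is greater than capacity")
    if None not in (batch, tx) and batch > tx:
        problems.append("batchSize is greater than transactionCapacity")

    # The channel directories must not live inside the spooling directory.
    spool = props.get(src + "spoolDir")
    for name in ("checkpointDir", "dataDirs"):
        for path in props.get(ch + name, "").split(","):
            if spool and path.strip() and is_inside(path.strip(), spool):
                problems.append(f"{name} {path.strip()} is inside spoolDir")
    return problems


if __name__ == "__main__":
    sample = """
    a1.sources.r1.spoolDir = D:/flumetest
    a1.sources.r1.batchSize = 100
    a1.channels.c1.checkpointDir = D:/flumedata/checkpoint
    a1.channels.c1.dataDirs = D:/flumedata/data
    a1.channels.c1.transactionCapacity = 500
    a1.channels.c1.capacity = 500
    """
    for problem in check_agent(parse_properties(sample)):
        print("Problem:", problem)
```

To check a real file, read its text and pass it to parse_properties instead of the inline sample.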

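(4) Flume's log4j.properties

This file configures the log files that Flume writes at runtime. Start Flume from cmd with: flume-ng.cmd agent -c ../conf -f ../conf/client.properties -n a1. When this log4j configuration is used, there is no need to add the option -property flume.root.logger=ERROR,console; the logs are written to the directory set by flume.log.dir.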

#flume.root.logger=DEBUG,console
flume.root.logger=ERROR,LOGFILE
flume.log.dir=./logs
flume.log.file=flume.log

log4j.logger.org.apache.flume.lifecycle = INFO
log4j.logger.org.jboss = WARN
log4j.logger.org.mortbay = INFO
log4j.logger.org.apache.avro.ipc.NettyTransceiver = WARN
log4j.logger.org.apache.hadoop = INFO
log4j.logger.org.apache.hadoop.hive = ERROR

# Define the root logger to the system property "flume.root.logger".
log4j.rootLogger=${flume.root.logger}

# Stock log4j rolling file appender
# Default log rotation configuration
log4j.appender.LOGFILE=org.apache.log4j.RollingFileAppender
log4j.appender.LOGFILE.MaxFileSize=100MB
log4j.appender.LOGFILE.MaxBackupIndex=10
log4j.appender.LOGFILE.File=${flume.log.dir}/${flume.log.file}
log4j.appender.LOGFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.LOGFILE.layout.ConversionPattern=%d{dd MMM yyyy HH:mm:ss,SSS} %-5p [%t] (%C.%M:%L) %x - %m%n

log4j.appender.DAILY=org.apache.log4j.rolling.RollingFileAppender
log4j.appender.DAILY.rollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.DAILY.rollingPolicy.ActiveFileName=${flume.log.dir}/${flume.log.file}
log4j.appender.DAILY.rollingPolicy.FileNamePattern=${flume.log.dir}/${flume.log.file}.%d{yyyy-MM-dd}
log4j.appender.DAILY.layout=org.apache.log4j.PatternLayout
log4j.appender.DAILY.layout.ConversionPattern=%d{dd MMM yyyy HH:mm:ss,SSS} %-5p [%t] (%C.%M:%L) %x - %m%n

# console
# Add "console" to flume.root.logger above if you want to use this
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout

log4j.appender.console.layout.ConversionPattern=%d (%t) [%p - %l] %m%n
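
To see where the LOGFILE appender ends up writing, it helps to expand the `${...}` placeholders by hand. The sketch below imitates, in Python, how log4j 1.x substitutes variables: a JVM system property with the same name is used first, then the key defined in the properties file. The function name `resolve` is hypothetical and the code is only an illustration of the lookup order, not log4j itself.

```python
import re

_VAR = re.compile(r"\$\{([^}]+)\}")


def resolve(value, file_props, system_props=None):
    """Expand ${name}: system properties first, then keys from the file."""
    system_props = system_props or {}

    def lookup(match):
        name = match.group(1)
        replacement = system_props.get(name, file_props.get(name))
        if replacement is None:
            return ""  # an unknown variable expands to nothing
        # A replacement may itself contain ${...}, so expand it as well.
        return resolve(replacement, file_props, system_props)

    return _VAR.sub(lookup, value)


if __name__ == "__main__":
    file_props = {"flume.log.dir": "./logs", "flume.log.file": "flume.log"}
    pattern = "${flume.log.dir}/${flume.log.file}"
    print(resolve(pattern, file_props))
    print(resolve(pattern, file_props, {"flume.log.dir": "D:/flumelogs"}))
```

With the values from this file the path is relative (./logs/flume.log), so the log lands under whatever working directory the agent was started from.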

(5) Run Flume in the background

1. Write startFlume.bat

@echo off
rem /d also switches the drive when the current drive is not D:
cd /d D:\APP\apache-flume-1.7.0-bin\bin

flume-ng.cmd agent -c ../conf -f ../conf/client.properties -n a1

2. Write the start.vbe file

CreateObject("WScript.Shell").Run "cmd /c D:\APP\apache-flume-1.7.0-bin\bin\startFlume.bat",0

3. Run start.vbe. The agent then runs in the background, and a java.exe process is visible in Task Manager.

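Notes:

1. JAVA_HOME must be set in the environment variables.

2. When configuring the Windows PATH environment variable, put the Java path at the very front of the PATH value. Otherwise the system may pick up a java.exe from another location, which causes Flume to fail.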
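
Both notes can be checked quickly from a script. The following is a minimal sketch in Python, assuming that the intended java.exe is the one in the bin folder of JAVA_HOME; the function name `java_under_java_home` is hypothetical.

```python
import os
import shutil
from pathlib import PureWindowsPath


def java_under_java_home(java_exe, java_home):
    """True if java_exe is located in the bin directory of java_home."""
    if not java_exe or not java_home:
        return False
    return PureWindowsPath(java_exe).parent == PureWindowsPath(java_home) / "bin"


if __name__ == "__main__":
    java_exe = shutil.which("java")          # first java found on PATH, or None
    java_home = os.environ.get("JAVA_HOME")  # None if the variable is not set
    print("java on PATH:", java_exe)
    print("JAVA_HOME   :", java_home)
    if not java_under_java_home(java_exe, java_home):
        print("Warning: the java found first on PATH is not in JAVA_HOME\\bin")
```

If the warning appears, move the Java bin directory to the front of PATH and open a new cmd window before starting the agent again.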


The next post covers the source code development of the custom Flume sink for SQL Server.





Reposted from blog.csdn.net/fengfengchen95/article/details/80820360