Chapter 11: Integrating Spark Streaming with Flume & Kafka to Build a General-Purpose Stream-Processing Foundation
1. Log-to-Flume stage
After writing the Flume configuration file streaming.conf (a sketch of it follows the start command below), go into the Flume conf directory:
cd /home/hadoop/app/apache-flume-1.6.0-cdh5.7.0-bin/conf
Start the Flume agent; note that --name must match the agent name defined in streaming.conf:
flume-ng agent --conf $FLUME_HOME/conf --conf-file $FLUME_HOME/conf/streaming.conf --name agent1 -Dflume.root.logger=INFO,console
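The contents of streaming.conf are not shown here; the following is a minimal sketch for this stage, assuming log events arrive over Avro and are simply echoed to the Flume console. The source/channel/sink names and port 41414 are illustrative assumptions, not taken from the original:

agent1.sources = avro-source
agent1.channels = memory-channel
agent1.sinks = log-sink

# Avro source: receives events pushed from the application (port is an assumption)
agent1.sources.avro-source.type = avro
agent1.sources.avro-source.bind = 0.0.0.0
agent1.sources.avro-source.port = 41414

# In-memory channel between source and sink
agent1.channels.memory-channel.type = memory

# Logger sink: prints each event to the console at INFO level
agent1.sinks.log-sink.type = logger

agent1.sources.avro-source.channels = memory-channel
agent1.sinks.log-sink.channel = memory-channel

On the application side, one common way to feed this source is Flume's log4j appender (from the flume-ng-log4jappender module); the hostname and port below are the same assumptions:

log4j.appender.flume = org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.Hostname = hadoop000
log4j.appender.flume.Port = 41414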
2. Flume-to-Kafka stage
Go into the ZooKeeper bin directory:
cd /home/hadoop/app/zookeeper-3.4.5-cdh5.7.0/bin/
Start ZooKeeper:
./zkServer.sh start
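You can confirm ZooKeeper is up before continuing:

./zkServer.sh status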
Start Kafka. First go into the Kafka bin directory:
cd /home/hadoop/app/kafka_2.11-0.9.0.0/bin
./kafka-server-start.sh -daemon /home/hadoop/app/kafka_2.11-0.9.0.0/config/server.properties
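Since -daemon detaches the broker from the terminal, it is worth confirming that it actually came up:

jps

The output should include a Kafka process alongside ZooKeeper's QuorumPeerMain; if it is missing, check logs/server.log under the Kafka directory.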
List the existing topics:
./kafka-topics.sh --list --zookeeper hadoop000:2181
Create the topic for this pipeline:
./kafka-topics.sh --create --zookeeper hadoop000:2181 --replication-factor 1 --partitions 1 --topic streamingtopic
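Optionally, inspect the partition and replica assignment of the new topic:

./kafka-topics.sh --describe --zookeeper hadoop000:2181 --topic streamingtopic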
Start the Flume agent with the second configuration, streaming2.conf, which sinks into Kafka:
flume-ng agent --conf $FLUME_HOME/conf --conf-file $FLUME_HOME/conf/streaming2.conf --name agent1 -Dflume.root.logger=INFO,console
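streaming2.conf is likewise not shown; a minimal sketch, assuming the same Avro source as in stage 1 but with the logger sink swapped for Flume 1.6's Kafka sink, writing to the streamingtopic created above:

agent1.sources = avro-source
agent1.channels = memory-channel
agent1.sinks = kafka-sink

# Same Avro source as in streaming.conf (port is an assumption)
agent1.sources.avro-source.type = avro
agent1.sources.avro-source.bind = 0.0.0.0
agent1.sources.avro-source.port = 41414

agent1.channels.memory-channel.type = memory

# Kafka sink; brokerList/requiredAcks/batchSize are the Flume 1.6 property names
agent1.sinks.kafka-sink.type = org.apache.flume.sink.kafka.KafkaSink
agent1.sinks.kafka-sink.topic = streamingtopic
agent1.sinks.kafka-sink.brokerList = hadoop000:9092
agent1.sinks.kafka-sink.requiredAcks = 1
agent1.sinks.kafka-sink.batchSize = 20

agent1.sources.avro-source.channels = memory-channel
agent1.sinks.kafka-sink.channel = memory-channel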
3. Kafka-to-Spark Streaming stage
Produce test messages:
./kafka-console-producer.sh --broker-list hadoop000:9092 --topic streamingtopic
Consume, to verify that messages flow end to end:
./kafka-console-consumer.sh --zookeeper hadoop000:2181 --topic streamingtopic
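The Spark Streaming side is not spelled out in this section; the following is a minimal Scala sketch using the direct (receiver-less) connector from the spark-streaming-kafka-0-8 artifact, which works against a 0.9 broker. The object name, local[2] master, and 5-second batch interval are assumptions:

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.kafka.KafkaUtils
import org.apache.spark.streaming.{Seconds, StreamingContext}

object KafkaStreamingApp {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("KafkaStreamingApp").setMaster("local[2]")
    val ssc = new StreamingContext(sparkConf, Seconds(5))

    // Direct stream: no receiver; one RDD partition per Kafka partition
    val kafkaParams = Map[String, String]("metadata.broker.list" -> "hadoop000:9092")
    val topics = Set("streamingtopic")
    val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    // Each message is a (key, value) pair; run a simple word count over the values
    messages.map(_._2)
      .flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}

With this running, lines typed into the console producer above should appear as word counts in the application output every batch interval.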