Reposted from: http://www.hemingliang.site/308.html
Viewing a topic's data distribution
[hadoop@m2 kafka_2.10-0.10.2.1]$ bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test
[2017-06-22 15:01:02,628] WARN Connected to an old server; r-o mode will be unavailable (org.apache.zookeeper.ClientCnxnSocket)
Topic:test PartitionCount:1 ReplicationFactor:1 Configs:
    Topic: test Partition: 0 Leader: 1 Replicas: 1 Isr: 1
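Leader: the id of the broker that hosts the partition's leader replica
Replicas: the brokers that hold replicas of the partition
Isr: the in-sync replicas, that is, the broker ids eligible to become leader
The output above shows that the only partition of test lives on the broker with id 1. On that machine, go into kafka_2.10-0.10.2.1/kafka-logs, which is the directory set by log.dirs in server.properties.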
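This directory contains a subdirectory named test-0. Log directories are named after the pattern topic name-partition number. Inside test-0 the contents are: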
[hadoop@m2 kafka-logs]$ cd test-0/
[hadoop@m2 test-0]$ ls
00000000000000000000.index  00000000000000000000.log  00000000000000000000.timeindex
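The directory naming rule described above (topic name, a hyphen, then the partition number) can be sketched in a few lines of Python. This is only an illustration: the function name parse_partition_dir is hypothetical and not part of Kafka. It splits on the last hyphen, on the assumption that a topic name may itself contain hyphens.

```python
def parse_partition_dir(dir_name):
    """Split a log directory name such as 'test-0' into (topic, partition).

    Hypothetical helper for illustration only. The split is done on the
    LAST hyphen because the topic name itself may contain hyphens.
    """
    topic, sep, partition = dir_name.rpartition("-")
    if not sep or not topic or not (partition.isascii() and partition.isdigit()):
        raise ValueError(f"not a topic-partition directory name: {dir_name!r}")
    return topic, int(partition)


print(parse_partition_dir("test-0"))
```

For the directory shown above this yields the topic test and partition 0.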
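The data files consist of a .index file, a .log file, and a .timeindex file.
Their contents can be inspected with kafka-run-class.sh, found in the bin directory of the Kafka installation.
Viewing the log file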
[hadoop@m2 test-0]$ ../../bin/kafka-run-class.sh kafka.tools.DumpLogSegments --files 00000000000000000000.log --print-data-log
Dumping 00000000000000000000.log
Starting offset: 0
offset: 0 position: 0 CreateTime: 1498104812192 isvalid: true payloadsize: 11 magic: 1 compresscodec: NONE crc: 3271928089 payload: hello world
offset: 1 position: 45 CreateTime: 1498104813269 isvalid: true payloadsize: 14 magic: 1 compresscodec: NONE crc: 242183772 payload: hello everyone
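The CreateTime values in the dump are timestamps in milliseconds since the Unix epoch. As a small aid for reading them, the following sketch converts such a value to a UTC datetime. The helper name create_time_to_utc is made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch as a timezone-aware datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def create_time_to_utc(create_time_ms):
    """Convert a CreateTime value (epoch milliseconds) to a UTC datetime.

    Hypothetical helper for illustration. Integer timedelta arithmetic is
    used so that no floating-point rounding is involved.
    """
    return EPOCH + timedelta(milliseconds=create_time_ms)


# The two CreateTime values from the dump above.
for ms in (1498104812192, 1498104813269):
    print(ms, "->", create_time_to_utc(ms).isoformat())
```

The first value corresponds to 2017-06-22T04:13:32.192 UTC, about a second before the second message.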
Viewing the index file
[hadoop@m2 test-0]$ ../../bin/kafka-run-class.sh kafka.tools.DumpLogSegments --files 00000000000000000000.index --print-data-log
Dumping 00000000000000000000.index
offset: 0 position: 0
Viewing the timeindex file
[hadoop@m2 test-0]$ ../../bin/kafka-run-class.sh kafka.tools.DumpLogSegments --files 00000000000000000000.timeindex --print-data-log
Dumping 00000000000000000000.timeindex
timestamp: 1498104813269 offset: 1
Found timestamp mismatch in :/home/hadoop/apps/kafka_2.10-0.10.2.1/kafka-logs/test-0/00000000000000000000.timeindex
  Index timestamp: 0, log timestamp: 1498104812192
Found out of order timestamp in :/home/hadoop/apps/kafka_2.10-0.10.2.1/kafka-logs/test-0/00000000000000000000.timeindex
  Index timestamp: 0, Previously indexed timestamp: 1498104813269
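The .index and .log files together make up a segment. Segment files are named as follows: the first segment of a partition starts at 0, and each later segment is named after its base offset, the offset of the first message it holds, which follows the last offset of the previous segment. The offset is a 64-bit long, so it has at most 19 decimal digits; the file name is left-padded with zeros to 20 digits, as in 00000000000000000000.log above. The log.segment.bytes parameter sets the maximum size of a single .log file; once the file reaches that size, a new segment is created.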
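The naming scheme can be sketched as follows. This is an illustration under stated assumptions, not Kafka's own code: the function names segment_file_name and find_segment are hypothetical, and the base offsets in the usage example are invented. Because each file name is the base offset of its segment, the segment holding a given offset is the one with the largest base offset that does not exceed it, which a binary search finds directly.

```python
from bisect import bisect_right


def segment_file_name(base_offset, suffix=".log"):
    """Build a segment file name: the base offset zero-padded to 20 digits."""
    if not 0 <= base_offset < 2**63:
        raise ValueError("offset must fit in a non-negative 64-bit long")
    return f"{base_offset:020d}{suffix}"


def find_segment(base_offsets, target_offset):
    """Return the .log file name of the segment that would hold target_offset.

    base_offsets must be sorted in ascending order. The matching segment is
    the one with the largest base offset <= target_offset.
    """
    i = bisect_right(base_offsets, target_offset) - 1
    if i < 0:
        raise ValueError("target offset is below the first segment")
    return segment_file_name(base_offsets[i])


# Invented base offsets for a partition with three segments.
base_offsets = [0, 368769, 737337]
print(segment_file_name(0))
print(find_segment(base_offsets, 400000))
```

With these invented offsets, offset 400000 falls in the segment whose base offset is 368769, since the next segment only starts at 737337.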