Next, let's look at consumer performance testing.
[root@hadoop-sh1-core1 bin]# ./kafka-consumer-perf-test.sh --help
Missing required argument "[topic]"
Option Description
------ -----------
--batch-size <Integer: size> Number of messages to write in a
single batch. (default: 200)
--broker-list <String: host> REQUIRED (unless old consumer is
used): A broker list to use for
connecting if using the new consumer.
--compression-codec <Integer: If set, messages are sent compressed
supported codec: NoCompressionCodec (default: 0)
as 0, GZIPCompressionCodec as 1,
SnappyCompressionCodec as 2,
LZ4CompressionCodec as 3>
--consumer.config <String: config file> Consumer config properties file.
--date-format <String: date format> The date format to use for formatting
the time field. See java.text.
SimpleDateFormat for options.
(default: yyyy-MM-dd HH:mm:ss:SSS)
--fetch-size <Integer: size> The amount of data to fetch in a
single request. (default: 1048576)
--from-latest If the consumer does not already have
an established offset to consume
from, start with the latest message
present in the log rather than the
earliest message.
--group <String: gid> The group id to consume on. (default:
perf-consumer-26926)
--help Print usage.
--hide-header If set, skips printing the header for
the stats
--message-size <Integer: size> The size of each message. (default:
100)
--messages <Long: count> REQUIRED: The number of messages to
send or consume
--new-consumer Use the new consumer implementation.
This is the default.
--num-fetch-threads <Integer: count> Number of fetcher threads. (default: 1)
--reporting-interval <Integer: Interval in milliseconds at which to
interval_ms> print progress info. (default: 5000)
--show-detailed-stats If set, stats are reported for each
reporting interval as configured by
reporting-interval
--socket-buffer-size <Integer: size> The size of the tcp RECV size.
(default: 2097152)
--threads <Integer: count> Number of processing threads.
(default: 10)
--topic <String: topic> REQUIRED: The topic to consume from.
--zookeeper <String: urls> REQUIRED (only when using old
consumer): The connection string for
the zookeeper connection in the form
host:port. Multiple URLS can be
given to allow fail-over. This
option is only used with the old
consumer.
Those are the parameter descriptions; now let's start testing.
[root@hadoop-sh1-core1 bin]# ./kafka-consumer-perf-test.sh --broker-list hadoop-sh1-master1:9092,hadoop-sh1-master2:9092,hadoop-sh1-core1:9092 --num-fetch-threads 1 --reporting-interval 5000 --threads 10 --topic test003 --messages 1000000
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec
2018-07-02 15:14:02:794, 2018-07-02 15:14:08:885, 976.9717, 160.3959, 1000419, 164245.4441
With 1 fetch thread and 10 processing threads consuming 1,000,000 messages, the run averaged 164245.4441 msg/s and 160.3959 MB/s. Let's increase the fetch threads and continue.
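As a sanity check, the MB.sec and nMsg.sec columns can be reproduced from the start/end timestamps and the consumed totals. A minimal sketch in Python, using the numbers from the run above:

```python
from datetime import datetime

# Timestamps use the tool's default date format: yyyy-MM-dd HH:mm:ss:SSS
FMT = "%Y-%m-%d %H:%M:%S:%f"
start = datetime.strptime("2018-07-02 15:14:02:794", FMT)
end = datetime.strptime("2018-07-02 15:14:08:885", FMT)
elapsed = (end - start).total_seconds()  # 6.091 seconds

messages = 1000419   # data.consumed.in.nMsg
mb = 976.9717        # data.consumed.in.MB

print(messages / elapsed)  # ~= reported nMsg.sec of 164245.4441
print(mb / elapsed)        # ~= reported MB.sec of 160.3959
```

The averages in every row of the output are simply totals divided by wall-clock elapsed time.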
[root@hadoop-sh1-core1 bin]# ./kafka-consumer-perf-test.sh --broker-list hadoop-sh1-master1:9092,hadoop-sh1-master2:9092,hadoop-sh1-core1:9092 --num-fetch-threads 10 --reporting-interval 5000 --threads 10 --topic test003 --messages 1000000
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec
2018-07-02 15:17:45:956, 2018-07-02 15:17:51:506, 976.9717, 176.0309, 1000419, 180255.6757
After increasing the fetch threads to 10 as well, the averages rose somewhat, to 180255.6757 msg/s and 176.0309 MB/s.
[root@hadoop-sh1-core1 bin]# ./kafka-consumer-perf-test.sh --broker-list hadoop-sh1-master1:9092,hadoop-sh1-master2:9092,hadoop-sh1-core1:9092 --num-fetch-threads 10 --reporting-interval 5000 --threads 20 --topic test003 --messages 1000000
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec
2018-07-02 15:19:24:626, 2018-07-02 15:19:30:463, 976.9805, 167.3772, 1000428, 171394.2094
Increasing the processing threads to 20 brought no improvement, so the bottleneck is not in the processing threads. Let's keep tuning other parameters.
[root@hadoop-sh1-core1 bin]# ./kafka-consumer-perf-test.sh --broker-list hadoop-sh1-master1:9092,hadoop-sh1-master2:9092,hadoop-sh1-core1:9092 --num-fetch-threads 20 --reporting-interval 5000 --threads 20 --topic test003 --messages 1000000
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec
2018-07-02 15:21:32:301, 2018-07-02 15:21:36:755, 976.9717, 219.3470, 1000419, 224611.3606
After raising the fetch threads to 20, performance improved markedly: the roughly 6-second consumption time of the earlier runs dropped to just over 4 seconds, reaching 224611.3606 msg/s and 219.3470 MB/s. Clearly, adding more fetchers helps consumer throughput a great deal; just keep your topic's partition count in mind.
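The partition caveat exists because, within a consumer group, each partition is consumed by at most one consumer, so threads beyond the partition count sit idle. A toy sketch of that constraint (the partition and thread counts below are hypothetical, not test003's actual configuration):

```python
def assign_partitions(num_partitions, num_consumers):
    """Round-robin partitions across consumers, Kafka-group style:
    each partition goes to exactly one consumer, and any consumer
    beyond the partition count receives nothing."""
    assignment = {c: [] for c in range(num_consumers)}
    for p in range(num_partitions):
        assignment[p % num_consumers].append(p)
    return assignment

# Hypothetical: a 6-partition topic read by 20 consumer threads.
a = assign_partitions(6, 20)
idle = [c for c, parts in a.items() if not parts]
print(len(idle))  # 14 of the 20 threads have no partition to read from
```

So scaling fetch threads past the partition count buys nothing; the partition count is the upper bound on consumer parallelism.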
[root@hadoop-sh1-core1 bin]# ./kafka-consumer-perf-test.sh --broker-list hadoop-sh1-master1:9092,hadoop-sh1-master2:9092,hadoop-sh1-core1:9092 --batch-size 400 --num-fetch-threads 20 --reporting-interval 5000 --threads 20 --topic test003 --messages 1000000
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec
2018-07-02 15:24:15:713, 2018-07-02 15:24:22:712, 976.9805, 139.5886, 1000428, 142938.705
Next we raised batch-size (the number of messages per batch) to 400. Performance dropped rather than improved, so this parameter is not a bigger-is-better knob; it needs a sensible value.
[root@hadoop-sh1-core1 bin]# ./kafka-consumer-perf-test.sh --broker-list hadoop-sh1-master1:9092,hadoop-sh1-master2:9092,hadoop-sh1-core1:9092 --num-fetch-threads 20 --reporting-interval 5000 --threads 20 --topic test003 --messages 1000000
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec
2018-07-02 15:21:32:301, 2018-07-02 15:21:36:755, 976.9717, 219.3470, 1000419, 224611.3606
[root@hadoop-sh1-core1 bin]# ./kafka-consumer-perf-test.sh --broker-list hadoop-sh1-master1:9092,hadoop-sh1-master2:9092,hadoop-sh1-core1:9092 --fetch-size 2000000 --num-fetch-threads 20 --reporting-interval 5000 --threads 20 --topic test003 --messages 1000000
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec
2018-07-02 15:27:48:063, 2018-07-02 15:27:52:589, 976.8652, 215.8341, 1000310, 221014.1405
Building on the previous settings, we raised fetch-size from the default 1048576 to 2,000,000. Performance barely changed, which suggests the three machines may already be near their limit.
Next, test the impact of message size.
[root@hadoop-sh1-core1 bin]# ./kafka-consumer-perf-test.sh --broker-list hadoop-sh1-master1:9092,hadoop-sh1-master2:9092,hadoop-sh1-core1:9092 --num-fetch-threads 20 --reporting-interval 5000 --threads 20 --topic test003 --messages 1000000
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec
2018-07-02 15:35:29:318, 2018-07-02 15:35:34:346, 976.9805, 194.3080, 1000428, 198971.3604
[root@hadoop-sh1-core1 bin]# ./kafka-consumer-perf-test.sh --broker-list hadoop-sh1-master1:9092,hadoop-sh1-master2:9092,hadoop-sh1-core1:9092 --message-size 25 --num-fetch-threads 20 --reporting-interval 5000 --threads 20 --topic test003 --messages 1000000
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec
2018-07-02 15:34:44:607, 2018-07-02 15:34:49:050, 976.9717, 219.8901, 1000419, 225167.4544
The default message size is 100; lowering it to 25 produced a clear improvement in throughput, so message size is also a contributing factor.
From these tests we can conclude that num-fetch-threads (fetch threads) and threads (processing threads) have the biggest impact, while message-size, batch-size, and fetch-size have some effect.
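Pulling the runs above together, the relative gains are easy to see when tabulated. A small sketch using only the figures measured in this post:

```python
# (fetch threads, processing threads, extra option) -> measured nMsg.sec
runs = {
    ("fetch=1",  "threads=10", "-"):                  164245.4441,
    ("fetch=10", "threads=10", "-"):                  180255.6757,
    ("fetch=10", "threads=20", "-"):                  171394.2094,
    ("fetch=20", "threads=20", "-"):                  224611.3606,
    ("fetch=20", "threads=20", "batch-size=400"):     142938.705,
    ("fetch=20", "threads=20", "fetch-size=2000000"): 221014.1405,
    ("fetch=20", "threads=20", "message-size=25"):    225167.4544,
}

baseline = runs[("fetch=1", "threads=10", "-")]
for cfg, rate in sorted(runs.items(), key=lambda kv: -kv[1]):
    print(f"{' '.join(cfg):45s} {rate:12.1f} msg/s  x{rate / baseline:.2f}")
```

Sorting by throughput makes the pattern explicit: the fetch-thread increases account for most of the gain over the baseline, while batch-size=400 is the only change that fell below it.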