Problem: when using the Logstash Aggregate filter, data in the aggregated arrays gets scrambled and overwritten.
Cause: Logstash runs filters on multiple worker threads by default, so events belonging to the same task can be processed concurrently and the aggregated data ends up corrupted.
Description
The aim of this filter is to aggregate information available among several events (typically log
lines) belonging to a same task, and finally push aggregated information into final task event.
You should be very careful to set Logstash filter workers to 1 (-w 1 flag) for this filter to work
correctly otherwise events may be processed out of sequence and unexpected results will occur.
As the official documentation states, the filter worker count must be set to 1, otherwise there will be problems.
Solution:
Add -w 1 when starting Logstash.
Example:
logstash -w 1 -f logstash.conf
Aggregation example:
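Equivalently, the worker count can be pinned in logstash.yml instead of being passed on the command line each time (a config sketch; the file location depends on your installation):

```yaml
# logstash.yml - force a single pipeline worker so the aggregate
# filter sees all events of a task in order on one thread
pipeline.workers: 1
```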
filter {
  # aggregation happens here
  aggregate {
    task_id => "%{id}"
    code => "
      map['id'] = event.get('id')
      # 'type' field from the input, used for conditional handling downstream
      map['type'] = event.get('type')
      map['name'] = event.get('name')
      map['test_list'] ||= []
      map['tests'] ||= []
      # skip rows that have no joined test record
      if (event.get('test_id') != nil)
        # deduplicate; this could also be done in the SQL query instead
        if !(map['test_list'].include? event.get('test_id'))
          map['test_list'] << event.get('test_id')
          map['tests'] << {
            'test_id' => event.get('test_id'),
            'test_name' => event.get('test_name')
          }
        end
      end
      # drop the original flat event; only the aggregated map is emitted
      event.cancel()
    "
    push_previous_map_as_event => true
    timeout => 5
  }
}
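The folding logic inside the `code =>` block can be sketched in plain Ruby outside Logstash. This is a minimal standalone simulation, not the plugin itself: the sample rows are hypothetical flat events from a one-to-many SQL join, and the hash of maps stands in for the plugin's per-task_id map.

```ruby
# Sketch of the aggregate block's behavior: fold flat one-to-many rows
# into one document per id, deduplicating nested tests by test_id.
rows = [
  { 'id' => 1, 'type' => 'a', 'name' => 'n1', 'test_id' => 10, 'test_name' => 't10' },
  { 'id' => 1, 'type' => 'a', 'name' => 'n1', 'test_id' => 11, 'test_name' => 't11' },
  { 'id' => 1, 'type' => 'a', 'name' => 'n1', 'test_id' => 10, 'test_name' => 't10' }, # duplicate row
]

# one map per task_id, like the plugin keeps one map per "%{id}"
maps = Hash.new { |h, k| h[k] = { 'test_list' => [], 'tests' => [] } }

rows.each do |event|
  map = maps[event['id']]
  map['id']   = event['id']
  map['type'] = event['type']
  map['name'] = event['name']
  next if event['test_id'].nil?          # skip rows with no joined test
  unless map['test_list'].include?(event['test_id'])   # dedup by test_id
    map['test_list'] << event['test_id']
    map['tests'] << { 'test_id' => event['test_id'],
                      'test_name' => event['test_name'] }
  end
end

result = maps.values   # what push_previous_map_as_event would emit
```

Processing the three rows above yields a single document whose tests array holds two entries (test_id 10 and 11); the duplicate third row is dropped. This also shows why -w 1 matters: with several worker threads there would be several interleaved updates to the same map, and the dedup check would race.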
Reference:
Logstash syncing one-to-many MySQL data to ES (pitfall diary series):
https://blog.csdn.net/menglinjie/article/details/102984845