-- For events and clicks, pull out the grouping fields and wrap the whole record in a tuple.
events = foreach events generate opxpid, client_id, TOTUPLE(*) as actual;
clicks = foreach clicks generate opxpid, client_id, TOTUPLE(*) as actual;
-- Merge the two relations.
cstream = union events, clicks;
-- Group by the shared key.
grpd = group cstream by (opxpid, client_id) parallel 18;
-- Unpack the grouped stream: first flatten the bag, then the wrapped tuple.
strmi = foreach grpd generate FLATTEN(cstream.actual);
strmi = foreach strmi generate FLATTEN(actual);
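To make the dataflow of the script concrete, here is a minimal Python sketch of the same steps (wrap, union, group, flatten). It is only a model of the logic, not Pig itself: the sample rows and the helper names `wrap` and `regroup` are hypothetical, and the first two fields of each row are assumed to be `opxpid` and `client_id`.

```python
from collections import defaultdict

# Hypothetical sample rows; field 0 stands in for opxpid, field 1 for client_id.
events = [("p1", "c1", "view"), ("p2", "c9", "view")]
clicks = [("p1", "c1", "ad42", 3)]

def wrap(rows):
    # Models: foreach rows generate opxpid, client_id, TOTUPLE(*) as actual
    return [(r[0], r[1], r) for r in rows]

def regroup(events, clicks):
    # Models: union events, clicks
    cstream = wrap(events) + wrap(clicks)
    # Models: group cstream by (opxpid, client_id)
    grpd = defaultdict(list)
    for opxpid, client_id, actual in cstream:
        grpd[(opxpid, client_id)].append(actual)
    # Models the two FLATTEN steps: unnest each group's bag, then the wrapped
    # tuple, which gives back the original rows with same-key rows adjacent.
    return [actual for key in grpd for actual in grpd[key]]

strmi = regroup(events, clicks)
print(strmi)
```

Rows from both inputs come back unchanged, with records that share the same `(opxpid, client_id)` next to each other. The ordering shown by this sketch is a property of the Python model; Pig itself makes no ordering guarantee across groups.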
Pig: grouping multiple relations by the same set of fields
Reposted from schooltop.iteye.com/blog/2109039