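UNION concatenates relations that share the same schema; SPLIT partitions one relation into several relations according to conditions.
1. Sample data: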
==============================================
[root@cdh1 data]# cat demodata
xiaoxiao,12,12.1f
aaa,13,1.1f
kjkj,12,12.1f
ddf,19,12.8f
youyou,89,12.3f
==============================================
2. Using union
grunt> A = load '/root/xytest/pig/data/demodata' using PigStorage(',') as (name:chararray,age:int,gpa:float);
grunt> B = load '/root/xytest/pig/data/demodata' using PigStorage(',') as (name:chararray,age:int,gpa:float);
grunt> C = union A,B;
grunt> dump C;
Output (A and B load the same file, so every row appears twice):
(xiaoxiao,12,12.1)
(aaa,13,1.1)
(kjkj,12,12.1)
(ddf,19,12.8)
(youyou,89,12.3)
(xiaoxiao,12,12.1)
(aaa,13,1.1)
(kjkj,12,12.1)
(ddf,19,12.8)
(youyou,89,12.3)
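UNION does not remove duplicates, and Pig does not guarantee the order of the resulting tuples. If a set-style union is wanted, one possible approach is to apply DISTINCT to the result. The following is a minimal sketch that reuses the path and schema from the example above; the alias U is only an illustrative name.

```pig
-- Load the same file twice, as in the example above.
A = load '/root/xytest/pig/data/demodata' using PigStorage(',') as (name:chararray,age:int,gpa:float);
B = load '/root/xytest/pig/data/demodata' using PigStorage(',') as (name:chararray,age:int,gpa:float);

-- C holds 10 tuples: each of the 5 rows appears twice.
C = union A, B;

-- U (hypothetical alias) keeps one copy of each tuple, i.e. the 5 unique rows.
U = distinct C;
dump U;
```

DISTINCT compares whole tuples, so two rows are merged only when every field matches.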
3. Using split
grunt> A = load '/root/xytest/pig/data/demodata' using PigStorage(',') as (name:chararray,age:int,gpa:float);
grunt> split A into D if age < 18, E if age >= 18;
grunt> dump D;
Output:
(xiaoxiao,12,12.1)
(aaa,13,1.1)
(kjkj,12,12.1)
grunt> dump E;
Output:
(ddf,19,12.8)
(youyou,89,12.3)
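In SPLIT, each condition is evaluated independently: a tuple that matches several conditions goes into each matching relation, and a tuple that matches none is dropped. The sketch below shows two ways to write the same split as above. The first uses one FILTER per output; the second uses the OTHERWISE branch, which I believe is available from Pig 0.10 onward, so treat that version requirement as an assumption. The aliases D1, E1, D2 and E2 are illustrative names only.

```pig
A = load '/root/xytest/pig/data/demodata' using PigStorage(',') as (name:chararray,age:int,gpa:float);

-- Variant 1: one FILTER per output relation.
D1 = filter A by age < 18;
E1 = filter A by age >= 18;

-- Variant 2: OTHERWISE is the default branch for tuples that do not
-- match any earlier condition (assumes a Pig release that supports it).
split A into D2 if age < 18, E2 otherwise;

dump D2;
dump E2;
```

With explicit conditions on both branches, as in the original statement, the conditions should be checked to cover every case that matters; otherwise some tuples silently disappear from all outputs.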