Preparation:
Suppose the input has two columns, child and parent, and we need to find every (grandchild, grandparent) pair.
Sample data (child, then parent):
Lucy Tom
Lucy Jone
Tom Heimi
Tom QiQi
Jone Candy
Jone God
Lucy has four grandparents: Heimi, QiQi, Candy, and God.
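Write the Mapper. The Mapper emits every input line twice, which is like joining a table with itself in SQL. The join column becomes the key: for the left table the key is the parent and the value is the child with the prefix 1; for the right table the key is the child and the value is the parent with the prefix 2.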
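import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SelfMapper extends Mapper<LongWritable, Text, Text, Text> {
    // Counts input lines that are skipped because they are malformed.
    enum Counter {
        LINESKIP
    }

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Each line is expected to be "child parent", separated by whitespace.
        String[] arr = value.toString().trim().split("\\s+");
        if (arr.length < 2) {
            context.getCounter(Counter.LINESKIP).increment(1);
            return;
        }
        // Left table: key = parent, value = "1" + child.
        context.write(new Text(arr[1]), new Text("1" + arr[0]));
        // Right table: key = child, value = "2" + parent.
        context.write(new Text(arr[0]), new Text("2" + arr[1]));
    }
}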
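To make the tagging concrete, the following is a minimal plain-Java sketch (no Hadoop involved) of what the Mapper above emits for the sample data and how the records end up grouped by key. The class name MapShuffleSketch and the method mapAndGroup are hypothetical names used only for this illustration, and the TreeMap only stands in for the shuffle. In a real job the order of values within one key is not guaranteed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: MapShuffleSketch and mapAndGroup are hypothetical names,
// and the TreeMap merely imitates grouping by key during the shuffle.
class MapShuffleSketch {

    // Emits two tagged records per "child parent" line and groups them by key.
    static Map<String, List<String>> mapAndGroup(List<String> lines) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : lines) {
            String[] arr = line.trim().split("\\s+");
            if (arr.length < 2) {
                continue; // malformed line, the case the LINESKIP counter tracks
            }
            // key = parent, value = "1" + child
            grouped.computeIfAbsent(arr[1], k -> new ArrayList<>()).add("1" + arr[0]);
            // key = child, value = "2" + parent
            grouped.computeIfAbsent(arr[0], k -> new ArrayList<>()).add("2" + arr[1]);
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("Lucy Tom", "Lucy Jone", "Tom Heimi",
                "Tom QiQi", "Jone Candy", "Jone God");
        mapAndGroup(lines).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

In this sketch the key Tom collects 1Lucy, 2Heimi and 2QiQi: Tom's child under prefix 1 and Tom's parents under prefix 2. The reduce step relies on exactly this grouping.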
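Write the Reducer. For each key, the Reducer splits the values by prefix and then pairs every left-table value (prefix 1) with every right-table value (prefix 2).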
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SelfReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        List<String> childLst = new ArrayList<String>();   // children of key (prefix 1)
        List<String> parentLst = new ArrayList<String>();  // parents of key (prefix 2)
        for (Text t : values) {
            String v = t.toString();
            if (v.startsWith("1")) {
                childLst.add(v.substring(1));
            } else if (v.startsWith("2")) {
                parentLst.add(v.substring(1));
            }
        }
        // Each child of key paired with each parent of key is a (grandchild, grandparent) pair.
        for (String child : childLst) {
            for (String parent : parentLst) {
                context.write(new Text(child), new Text(parent));
            }
        }
    }
}
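The pairing logic of the Reducer can also be tried without a cluster. Below is a small plain-Java sketch, assuming the tagged values for one key have already been collected into a list. ReduceJoinSketch and join are hypothetical names for this illustration and are not part of the Hadoop API.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: ReduceJoinSketch and join are hypothetical names.
class ReduceJoinSketch {

    // Splits the tagged values of one key by prefix, then builds the cross product.
    static List<String> join(List<String> taggedValues) {
        List<String> children = new ArrayList<>();
        List<String> parents = new ArrayList<>();
        for (String v : taggedValues) {
            if (v.startsWith("1")) {
                children.add(v.substring(1)); // a child of the key
            } else if (v.startsWith("2")) {
                parents.add(v.substring(1));  // a parent of the key
            }
        }
        List<String> pairs = new ArrayList<>();
        for (String child : children) {
            for (String parent : parents) {
                pairs.add(child + " " + parent); // grandchild, grandparent
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Values grouped under the key Tom in the sample data.
        System.out.println(join(List.of("1Lucy", "2Heimi", "2QiQi"))); // [Lucy Heimi, Lucy QiQi]
        // Values grouped under the key Lucy: parents only, so nothing is emitted.
        System.out.println(join(List.of("2Tom", "2Jone"))); // []
    }
}
```

A key that has values on only one side, such as Lucy (parents only) or Heimi (children only), produces no output, because one of the two lists is empty. This is why only Tom and Jone contribute to the result for the sample data.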
For the Main (driver) program, refer to other blog posts.