Given a child-parent table, output the grandchild-grandparent table



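    1. Requirements:

The task is to find the data of interest within the given data, that is, to mine information that the raw data contains only implicitly. Let's go through the example.

Given a child-parent table, output the grandchild-grandparent table.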

=================Sample input===================

child   parent
Tom   Lucy
Tom   Jack
Jone   Lucy
Jone   Jack
Lucy   Mary
Lucy   Ben
Jack   Alice
Jack   Jesse
Terry   Alice
Terry   Jesse
Philip   Terry
Philip   Alma
Mark   Terry
Mark   Alma

  [Figure: family tree of the sample data]


=================Sample output===================

grandchild   grandparent
Tom   Alice
Tom   Jesse
Jone   Alice
Jone   Jesse
Tom   Mary
Tom   Ben
Jone   Mary
Jone   Ben
Philip   Alice
Philip   Jesse
Mark   Alice
Mark   Jesse
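
To make the expected output concrete, the sketch below computes the same grandchild-grandparent pairs in plain Java, without Hadoop, by joining the table with itself (a row's parent is looked up as a child in the same table). The class name `GrandparentJoin` and the method `join` are hypothetical names chosen for this illustration; the sketch is only a cross-check of the sample output, not the MapReduce solution described in this post.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GrandparentJoin {

    // Joins the child-parent table with itself: for every (child, parent) row,
    // look up the parent's own parents and pair each of them with the child.
    public static List<String> join(String[][] rows) {
        // child -> that child's parents, in input order
        Map<String, List<String>> parentsOf = new LinkedHashMap<>();
        for (String[] row : rows) {
            parentsOf.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        List<String> result = new ArrayList<>();
        for (String[] row : rows) {
            List<String> grandparents = parentsOf.get(row[1]);
            if (grandparents == null) {
                continue; // the parent never appears as a child, so no grandparents are known
            }
            for (String grandparent : grandparents) {
                result.add(row[0] + " " + grandparent);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The sample input from this post
        String[][] rows = {
            {"Tom", "Lucy"}, {"Tom", "Jack"}, {"Jone", "Lucy"}, {"Jone", "Jack"},
            {"Lucy", "Mary"}, {"Lucy", "Ben"}, {"Jack", "Alice"}, {"Jack", "Jesse"},
            {"Terry", "Alice"}, {"Terry", "Jesse"}, {"Philip", "Terry"}, {"Philip", "Alma"},
            {"Mark", "Terry"}, {"Mark", "Alma"}
        };
        for (String pair : join(rows)) {
            System.out.println(pair);
        }
    }
}
```

On the sample input this yields the same 12 pairs as the sample output, although the row order differs.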

    2. Design:

    Take a pair of sample records as an example:

        child   parent

        Tom    Lucy

        Lucy    Mary


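      Mapper code snippet:

        context.write(new Text(values[0]), new Text(values[1]+"_1"));//key is the child of value    key:Tom    value:Lucy_1

        context.write(new Text(values[1]), new Text(values[0]+"_2"));//key is the parent of value   key:Lucy    value:Tom_2

     That is, for every line it reads, the mapper emits the record in both directions and tags each copy: "_1" marks a parent of the key, "_2" marks a child of the key. After the shuffle, the reducer for a key such as Lucy receives both her children (Tom_2) and her parents (Mary_1), and pairing every child with every parent gives the grandchild-grandparent rows.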
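
The two-direction emit and the join on the shared key can be tried out without a cluster. The following is a minimal sketch in plain Java that imitates the map, shuffle, and reduce steps in memory for the two sample records above. The class `TaggedJoinSketch` and its methods are hypothetical names used only for this illustration, and the `TreeMap` merely stands in for Hadoop's grouping of values by key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TaggedJoinSketch {

    // "Map" step: emit the record in both directions, tagged with its role.
    // The shuffle map plays the part of Hadoop grouping values by key.
    public static void map(String child, String parent, Map<String, List<String>> shuffle) {
        shuffle.computeIfAbsent(child, k -> new ArrayList<>()).add(parent + "_1"); // value is a parent of key
        shuffle.computeIfAbsent(parent, k -> new ArrayList<>()).add(child + "_2"); // value is a child of key
    }

    // "Reduce" step for one key: split the values by tag, then pair
    // every child of the key with every parent of the key.
    public static List<String> reduce(List<String> values) {
        List<String> parents = new ArrayList<>();
        List<String> children = new ArrayList<>();
        for (String v : values) {
            String name = v.substring(0, v.length() - 2); // strip the two-character tag
            if (v.endsWith("_1")) {
                parents.add(name);
            } else if (v.endsWith("_2")) {
                children.add(name);
            }
        }
        List<String> pairs = new ArrayList<>();
        for (String child : children) {
            for (String parent : parents) {
                pairs.add(child + "\t" + parent);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        Map<String, List<String>> shuffle = new TreeMap<>();
        map("Tom", "Lucy", shuffle);
        map("Lucy", "Mary", shuffle);
        for (Map.Entry<String, List<String>> e : shuffle.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue() + " => " + reduce(e.getValue()));
        }
    }
}
```

Only the key Lucy receives both a "_1" and a "_2" value, so it is the only key that produces a pair (Tom, Mary); Tom and Mary each see a single tag and produce nothing.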




    3. Program code:

    mapper.java

package com.company.family;

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class FamilyMapper extends Mapper<LongWritable, Text, Text, Text>{
	
	@Override
	protected void map(LongWritable key, Text value, Mapper<LongWritable, Text, Text, Text>.Context context)
			throws IOException, InterruptedException {
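		// value: "Tom   Lucy"
		String line = value.toString().trim();
		// Split on any run of whitespace so that tab- and space-separated input both work
		String[] values = line.split("\\s+");
		if (values.length < 2) {
			return; // skip blank or malformed lines
		}
		context.write(new Text(values[0]), new Text(values[1]+"_1"));// key is the child of value	key:Tom	value:Lucy_1
		context.write(new Text(values[1]), new Text(values[0]+"_2"));// key is the parent of value	key:Lucy value:Tom_2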
	}

}
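
Field splitting depends on the exact separator used in the input file: splitting on a literal three-space string returns the whole line as a single field when the file is tab-separated, and indexing the second field then fails. The sketch below, using a hypothetical helper `parse` that is not part of the original code, shows a more tolerant way to read a line under the assumption that names contain no whitespace.

```java
import java.util.Arrays;

public class SplitSketch {

    // Returns {child, parent}, or null when the line does not hold exactly two fields.
    // Assumes names themselves contain no whitespace.
    public static String[] parse(String line) {
        String[] fields = line.trim().split("\\s+");
        return fields.length == 2 ? fields : null;
    }

    public static void main(String[] args) {
        // A literal three-space separator does not match a tab, so only one field comes back
        System.out.println("Tom\tLucy".split("   ").length);
        // Splitting on any whitespace run handles tabs and uneven spacing alike
        System.out.println(Arrays.toString(parse("Tom\tLucy")));
        System.out.println(Arrays.toString(parse("Jone     Mary")));
        // Blank lines are rejected instead of causing an index error later
        System.out.println(parse("") == null);
    }
}
```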
    reducer.java
package com.company.family;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class FamilyReducer extends Reducer<Text, Text, Text, Text>{

	@Override
	protected void reduce(Text key, Iterable<Text> values, Reducer<Text, Text, Text, Text>.Context context)
			throws IOException, InterruptedException {
		// key:Lucy values:{Tom_2,Jone_2,Mary_1,Ben_1}
		// yeyelist holds the key's parents ("_1"), i.e. the grandparents of the key's children
		List<String> yeyelist = new ArrayList<String>();
		// children holds the key's children ("_2")
		List<String> children = new ArrayList<String>();
		for(Text val:values){
			if(val.toString().endsWith("_1")){
				yeyelist.add(val.toString());
			}else if(val.toString().endsWith("_2")){
				children.add(val.toString());
			}
		}
		// Output for key Lucy:
		// Tom	Mary
		// Tom	Ben
		// Jone	Mary
		// Jone	Ben
		for(String child:children){
			for(String yeye:yeyelist){
				// Strip the two-character tag before writing each child/grandparent pair
				context.write(new Text(child.substring(0, child.length()-2)), new Text(yeye.substring(0, yeye.length()-2)));
			}
		}
	}
}
    runner.java
package com.company.family;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;



public class FamilyRunner {
	public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {

		Configuration conf = new Configuration();
		Job job = Job.getInstance(conf);
		// Describe the job
		// Locate the job's jar via this class
		job.setJarByClass(FamilyRunner.class);
		// Mapper for the job
		job.setMapperClass(FamilyMapper.class);
		// Reducer for the job
		job.setReducerClass(FamilyReducer.class);
		// Mapper output types
		job.setMapOutputKeyClass(Text.class);
		job.setMapOutputValueClass(Text.class);
		// Reducer output types
		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(Text.class);
		// Input path for the job
		FileInputFormat.setInputPaths(job, new Path("/Users/xuran/Desktop/week"));
		// Output path for the job (the directory must not exist before the job runs)
		FileOutputFormat.setOutputPath(job, new Path("/Users/xuran/Desktop/week/result"));
		// Submit the job and wait for it to finish
		boolean waitForCompletion = job.waitForCompletion(true);

		System.exit(waitForCompletion?0:1);

	}
}


Reposted from blog.csdn.net/qq_21870555/article/details/80509639