Optimizing large-table joins: replacing SQL-level JOINs with joins in application code

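In day-to-day development you probably use MySQL's various JOIN operations a lot. With small data sets this works fine, but once tables reach millions or tens of millions of rows, multi-table joins can become very slow. One way to deal with this is to split the large join into smaller single-table queries, combine the results in application code, and then return the combined data to the front end.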

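Take the following scenario: a one-to-many relationship between classrooms and students.

A classroom has many students.
A student belongs to exactly one classroom.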

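Suppose we need the following queries:

1. Find all students in each classroom

Steps:

Query all classrooms.
Collect the classroom ids into roomIdList.
Use an IN query to fetch every student whose room id is in roomIdList, giving studentList.
Group studentList by classroom and attach each group to its classroom.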
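
The four steps above can be sketched roughly as follows. This is only an illustration under assumptions: the classes and the two finder methods (`findAllClassrooms`, `findStudentsByRoomIds`) are hypothetical in-memory stand-ins for the real single-table queries, not part of the original code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class ClassroomStudentsSketch {

    static class Student {
        final int id;
        final int roomId;
        Student(int id, int roomId) { this.id = id; this.roomId = roomId; }
    }

    static class Classroom {
        final int id;
        List<Student> students = Collections.emptyList();
        Classroom(int id) { this.id = id; }
    }

    // Hypothetical stand-in for "SELECT * FROM classroom".
    static List<Classroom> findAllClassrooms() {
        return Arrays.asList(new Classroom(111), new Classroom(222), new Classroom(333));
    }

    // Hypothetical stand-in for "SELECT * FROM student WHERE room_id IN (...)".
    static List<Student> findStudentsByRoomIds(Set<Integer> roomIds) {
        List<Student> table = Arrays.asList(
                new Student(1, 111), new Student(2, 222),
                new Student(3, 111), new Student(4, 999));
        return table.stream()
                .filter(s -> roomIds.contains(s.roomId))
                .collect(Collectors.toList());
    }

    static List<Classroom> loadClassroomsWithStudents() {
        // Step 1: query all classrooms.
        List<Classroom> rooms = findAllClassrooms();
        // Step 2: collect the classroom ids.
        Set<Integer> roomIds = rooms.stream().map(r -> r.id).collect(Collectors.toSet());
        // Step 3: fetch the matching students with a single IN query.
        List<Student> students = findStudentsByRoomIds(roomIds);
        // Step 4: group the students by room id and attach each group to its classroom.
        Map<Integer, List<Student>> byRoom =
                students.stream().collect(Collectors.groupingBy(s -> s.roomId));
        rooms.forEach(r -> r.students = byRoom.getOrDefault(r.id, Collections.emptyList()));
        return rooms;
    }

    public static void main(String[] args) {
        for (Classroom room : loadClassroomsWithStudents()) {
            System.out.println(room.id + " -> " + room.students.size() + " student(s)");
        }
    }
}
```

A classroom with no students ends up with an empty list rather than null, which is usually easier for callers to handle.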

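2. Find the classroom name for each student

This is equivalent to:

select stu.*, room.name from student stu left join classroom room on stu.room_id = room.id;

Steps:

Query all students.
Collect the roomId of every student into roomIdList, removing duplicates.
Use an IN query to fetch the classrooms in roomIdList, giving classroomList.
Match on stu.room_id = room.id and set room.name on each student.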
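
A minimal sketch of these steps, again with made-up in-memory data: `findAllStudents` and `findClassroomsByIds` are hypothetical placeholders for the real queries. A student whose classroom is missing keeps a null room name, which mirrors the LEFT JOIN behavior.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class StudentRoomNameSketch {

    static class Classroom {
        final int id;
        final String name;
        Classroom(int id, String name) { this.id = id; this.name = name; }
    }

    static class Student {
        final String name;
        final int roomId;
        String roomName; // filled in by the code-level join
        Student(String name, int roomId) { this.name = name; this.roomId = roomId; }
    }

    // Hypothetical stand-in for "SELECT * FROM student".
    static List<Student> findAllStudents() {
        return Arrays.asList(
                new Student("Ann", 111), new Student("Bob", 222), new Student("Cid", 999));
    }

    // Hypothetical stand-in for "SELECT * FROM classroom WHERE id IN (...)".
    static List<Classroom> findClassroomsByIds(Set<Integer> ids) {
        List<Classroom> table = Arrays.asList(
                new Classroom(111, "Room A"), new Classroom(222, "Room B"),
                new Classroom(333, "Room C"));
        return table.stream().filter(r -> ids.contains(r.id)).collect(Collectors.toList());
    }

    static List<Student> loadStudentsWithRoomName() {
        // Step 1: query all students.
        List<Student> students = findAllStudents();
        // Step 2: collect the distinct room ids (a Set removes duplicates).
        Set<Integer> roomIds = students.stream().map(s -> s.roomId).collect(Collectors.toSet());
        // Step 3: fetch only the referenced classrooms with one IN query.
        List<Classroom> rooms = findClassroomsByIds(roomIds);
        // Step 4: index the classroom names by id, then copy the name onto each student.
        Map<Integer, String> nameById =
                rooms.stream().collect(Collectors.toMap(r -> r.id, r -> r.name));
        students.forEach(s -> s.roomName = nameById.get(s.roomId));
        return students;
    }

    public static void main(String[] args) {
        loadStudentsWithRoomName()
                .forEach(s -> System.out.println(s.name + " -> " + s.roomName));
    }
}
```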

The following shows concrete usage (with some mock data):


    public static void main(String[] args) {
        boolean flag = true;

        List<Student> studentList = new ArrayList<>();
        List<Classroom> classroomList = new ArrayList<>();

        Classroom classroom = new Classroom();
        classroom.setRoomId(111);
        classroom.setRoomName("name+" + 111);
        Classroom classroom2 = new Classroom();
        classroom2.setRoomId(222);
        classroom2.setRoomName("name+" + 222);

        classroomList.add(classroom);
        classroomList.add(classroom2);

        for (int i = 0; i < 10; i++) {
            Student student = new Student();
            student.setName("name+" + i);

            if (flag) {
                student.setRoomId(classroom.getRoomId());
            } else {
                student.setRoomId(classroom2.getRoomId());
            }
            flag = !flag;

            studentList.add(student);
        }

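        // Assign each student the name of the classroom they belong to.
        // Objects.equals (java.util.Objects) is safe whether getRoomId() returns int or Integer;
        // comparing two Integer objects with == would compare references, not values.
        MergeUtil.merge(studentList, classroomList,
                (stu, room) -> Objects.equals(stu.getRoomId(), room.getRoomId()),
                (stu, room) -> stu.setRoomName(room.getRoomName()));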

        studentList.forEach((data) -> {
            System.out.println(data.toString());
        });

        // For each classroom, assign the students that belong to it
        classroomList.forEach(room -> {
            List<Student> dataList = studentList.stream()
                    .filter(student1 -> Objects.equals(student1.getRoomId(), room.getRoomId()))
                    .collect(Collectors.toList());
            room.setStudentList(dataList);
        });

        System.out.println(classroomList);

        // The same thing using the utility class:
        // for each classroom, assign the students that belong to it
        MergeUtil.mergeList(classroomList, studentList,
                (classroom1, student1) -> Objects.equals(classroom1.getRoomId(), student1.getRoomId()),
                (classroom1, students) -> classroom1.setStudentList(students));

        System.out.println(classroomList);


    }

If you do not want to use the Java 8 filter operation, you can write it by hand with for loops, as follows:

        // Group the students by room id in a single pass.
        HashMap<Integer, List<Student>> hashMap = new HashMap<>();
        for (Student stu : studentList) {
            hashMap.computeIfAbsent(stu.getRoomId(), id -> new ArrayList<>()).add(stu);
        }
        // Attach each group to its classroom; a classroom without students gets an empty list.
        for (Classroom room : classroomList) {
            room.setStudentList(hashMap.getOrDefault(room.getRoomId(), new ArrayList<>()));
        }

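In my tests with 1,000,000 students, however, the Java 8 filter version took roughly half the time of the hand-written for loop + HashMap version, so I tend to recommend filter. Note that the filter version rescans the whole student list once per classroom, so its cost grows with (number of classrooms × number of students), while the map version makes a single pass over the students; the result may therefore not hold when there are many classrooms.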
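
To re-run this kind of comparison on data shaped like your own, a rough harness along the following lines may help. It is a sketch under assumptions: students are reduced to array indexes, the classroom counts (2 and 50) are arbitrary, and `System.nanoTime` timing without warm-up is only indicative; a benchmarking tool such as JMH would give more trustworthy numbers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class GroupingTimingSketch {

    // Approach A: for every room, filter the whole student array (rooms x students checks).
    static Map<Integer, List<Integer>> groupByFilter(int roomCount, int[] studentRoom) {
        Map<Integer, List<Integer>> result = new HashMap<>();
        for (int room = 0; room < roomCount; room++) {
            final int r = room;
            result.put(r, IntStream.range(0, studentRoom.length)
                    .filter(i -> studentRoom[i] == r)
                    .boxed()
                    .collect(Collectors.toList()));
        }
        return result;
    }

    // Approach B: one pass over the students, appending each to its room's list.
    static Map<Integer, List<Integer>> groupByMap(int roomCount, int[] studentRoom) {
        Map<Integer, List<Integer>> result = new HashMap<>();
        for (int room = 0; room < roomCount; room++) {
            result.put(room, new ArrayList<>());
        }
        for (int i = 0; i < studentRoom.length; i++) {
            result.get(studentRoom[i]).add(i);
        }
        return result;
    }

    // studentRoom[i] is the room id of student i; students are spread round-robin.
    static int[] sampleAssignment(int studentCount, int roomCount) {
        int[] assignment = new int[studentCount];
        for (int i = 0; i < studentCount; i++) {
            assignment[i] = i % roomCount;
        }
        return assignment;
    }

    public static void main(String[] args) {
        int students = 1_000_000;
        for (int rooms : new int[] {2, 50}) {
            int[] assignment = sampleAssignment(students, rooms);
            long t0 = System.nanoTime();
            groupByFilter(rooms, assignment);
            long t1 = System.nanoTime();
            groupByMap(rooms, assignment);
            long t2 = System.nanoTime();
            System.out.printf("rooms=%d filter=%d ms, map=%d ms%n",
                    rooms, (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
        }
    }
}
```

Both methods produce the same grouping, so the only thing being compared is how the work scales with the number of classrooms.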

Finally, here is the utility class used by the code above:

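import java.util.List;
import java.util.Optional;
import java.util.function.BiConsumer;
import java.util.function.BiFunction;
import java.util.stream.Collectors;

public class MergeUtil {

    /**
     * Merges properties from items in sourceList into items in targetList.
     * For each target, the first source item that satisfies testFunction is
     * passed to biConsumer, which implements the actual merge logic.
     *
     * @param targetList   the items to be enriched
     * @param sourceList   the items that provide the data
     * @param testFunction returns true when a target and a source item match
     * @param biConsumer   copies data from the matching source item into the target
     * @param <T>          the target item type
     * @param <S>          the source item type
     */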
    public static <T, S> void merge(List<T> targetList, List<S> sourceList, BiFunction<? super T, ? super S, Boolean> testFunction,
                                    BiConsumer<? super T, ? super S> biConsumer) {

        targetList.forEach((t)->{
            Optional<S> optional = sourceList.stream().filter(s->testFunction.apply(t, s)).findFirst();
            if (optional.isPresent()) {
                biConsumer.accept(t, optional.get());
            }
        });

    }

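    /**
     * Groups the items of sourceList and merges each group into the
     * matching item of targetList.
     *
     * @param targetList   the items to be enriched
     * @param sourceList   the items to be grouped
     * @param testFunction returns true when a source item belongs to a target
     * @param biConsumer   stores the list of matching source items on the target
     * @param <T>          the target item type
     * @param <S>          the source item type
     */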
    public static <T, S> void mergeList(List<T> targetList, List<S> sourceList, BiFunction<? super T, ? super S, Boolean> testFunction,
                                    BiConsumer<? super T, ? super List<S>> biConsumer) {

        targetList.forEach((t)->{
            // collect() always returns a list (possibly empty), so no Optional wrapper is needed.
            List<S> dataList = sourceList.stream().filter(s->testFunction.apply(t, s)).collect(Collectors.toList());
            biConsumer.accept(t, dataList);
        });
    }
}
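
Both methods above test every target against every source item. When the match condition is plain key equality, as in the classroom example, a keyed variant can index the source list once and look each target up in a map. The sketch below is not part of the original utility: `KeyedMergeSketch`, `mergeByKey`, and its key-extractor parameters are hypothetical names introduced here for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

class KeyedMergeSketch {

    /**
     * Hypothetical keyed counterpart of MergeUtil.merge: matches targets and
     * sources by an extracted key instead of an arbitrary test function.
     */
    static <T, S, K> void mergeByKey(List<T> targetList, List<S> sourceList,
                                     Function<? super T, ? extends K> targetKey,
                                     Function<? super S, ? extends K> sourceKey,
                                     BiConsumer<? super T, ? super S> biConsumer) {
        // Index the source items by key; putIfAbsent keeps the first item per key,
        // which matches the findFirst() behavior of MergeUtil.merge.
        Map<K, S> index = new HashMap<>();
        for (S s : sourceList) {
            index.putIfAbsent(sourceKey.apply(s), s);
        }
        // One map lookup per target instead of a scan over the whole source list.
        for (T t : targetList) {
            S match = index.get(targetKey.apply(t));
            if (match != null) {
                biConsumer.accept(t, match);
            }
        }
    }

    public static void main(String[] args) {
        // Each int[] is {key, value}; targets start with value 0.
        List<int[]> targets = Arrays.asList(new int[] {1, 0}, new int[] {2, 0}, new int[] {3, 0});
        List<int[]> sources = Arrays.asList(new int[] {1, 10}, new int[] {2, 20});
        KeyedMergeSketch.<int[], int[], Integer>mergeByKey(
                targets, sources, t -> t[0], s -> s[0], (t, s) -> t[1] = s[1]);
        targets.forEach(t -> System.out.println(t[0] + " -> " + t[1]));
    }
}
```

The trade-off is flexibility: the original testFunction can express any condition, while this variant only handles equality on a single key.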
