Is there a performance difference between multiple filters and a single filter in the Java Stream API?

Background:

During a pre-release code review, I came across the following code:

workShiftOrderDTOS = workShiftOrderDTOS.stream()
        .filter(x -> x.getShiftDO().isFullDayWorkShift() == false)
        .collect(Collectors.toList());

The developer had added one more filter condition:

workShiftOrderDTOS = workShiftOrderDTOS.stream()
        .filter(x -> x.getShiftDO().isFullDayWorkShift() == false)
        .filter(x -> x.getShiftDO().canViewShift())
        .collect(Collectors.toList());

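Thinking it over

That code review raises a question: the two snippets below filter a collection and return the same result, but do they differ in performance?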

    @Data
    static class Employee { // static so that the static main method can instantiate it
        private String gender;
        private Integer age;
    }

    public static void main(String[] args) {
        List<Employee> employees = Lists.newArrayList();

        // more than one filter
        employees.stream()
            .filter(employee -> employee.getAge() > 32)
            .filter(employee -> "male".equals(employee.getGender()));
        
        // a single filter
        employees.stream()
            .filter(employee ->
                employee.getAge() > 32
                    && "male".equals(employee.getGender()));
    }
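
Before comparing speed, it is worth confirming that the two forms really return the same elements. The following is a minimal sketch that uses a plain-Java stand-in for the Lombok Employee class above; the class name, helper methods, and sample data are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FilterEquivalenceDemo {
    // Plain-Java stand-in for the Lombok @Data Employee class.
    static final class Employee {
        private final String gender;
        private final int age;

        Employee(String gender, int age) {
            this.gender = gender;
            this.age = age;
        }

        String getGender() { return gender; }
        int getAge() { return age; }
    }

    // Two chained filter() calls.
    static List<Employee> twoFilters(List<Employee> in) {
        return in.stream()
                .filter(e -> e.getAge() > 32)
                .filter(e -> "male".equals(e.getGender()))
                .collect(Collectors.toList());
    }

    // The same two conditions combined with && in one filter() call.
    static List<Employee> oneFilter(List<Employee> in) {
        return in.stream()
                .filter(e -> e.getAge() > 32 && "male".equals(e.getGender()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee("male", 40),
                new Employee("female", 45),
                new Employee("male", 20),
                new Employee("male", 33));
        // Employee does not override equals, so the lists are compared by element identity and order.
        System.out.println(twoFilters(employees).equals(oneFilter(employees))); // prints true
    }
}
```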

Looking at the compiled class first, we can see that the compiler did not optimize anything.


So I wrote four test programs as comparison groups and compared their running times:

1 filter:

public static void main(String[] args) {
    List<Employee> employees = Lists.newArrayList();

    // Build 100,000 employees: age cycles through 0..99, gender alternates.
    for (int i = 1; i <= 100000; i++) {
        Employee employee = new Employee();

        employee.setAge(i % 100);
        employee.setGender(i % 2 == 0 ? "male" : "female");

        employees.add(employee);
    }

    StopWatch watch = new StopWatch();
    
    watch.start();
    employees.stream()
        .filter(employee ->
            employee.getAge() > 32
                && "male".equals(employee.getGender()))
        .collect(Collectors.toList());
    watch.stop();
    System.out.println(watch.getTotalTimeMillis());
}

2 filters:

public static void main(String[] args) {
    List<Employee> employees = Lists.newArrayList();

    // Build 100,000 employees: age cycles through 0..99, gender alternates.
    for (int i = 1; i <= 100000; i++) {
        Employee employee = new Employee();

        employee.setAge(i % 100);
        employee.setGender(i % 2 == 0 ? "male" : "female");

        employees.add(employee);
    }

    StopWatch watch = new StopWatch();

    watch.start();
    employees.stream()
        .filter(employee -> employee.getAge() > 32)
        .filter(employee -> "male".equals(employee.getGender()))
        .collect(Collectors.toList());
    watch.stop();
    System.out.println(watch.getTotalTimeMillis());
}

1 filter with sorted:

public static void main(String[] args) {
    List<Employee> employees = Lists.newArrayList();

    // Build 100,000 employees: age cycles through 0..99, gender alternates.
    for (int i = 1; i <= 100000; i++) {
        Employee employee = new Employee();

        employee.setAge(i % 100);
        employee.setGender(i % 2 == 0 ? "male" : "female");

        employees.add(employee);
    }

    StopWatch watch = new StopWatch();

    watch.start();
    employees.stream()
        .filter(employee ->
            employee.getAge() > 32
                && "male".equals(employee.getGender()))
        .sorted(Comparator.comparingInt(Employee::getAge))
        .collect(Collectors.toList());
    watch.stop();
    System.out.println(watch.getTotalTimeMillis());
}

2 filters with sorted:

public static void main(String[] args) {
    List<Employee> employees = Lists.newArrayList();

    // Build 100,000 employees: age cycles through 0..99, gender alternates.
    for (int i = 1; i <= 100000; i++) {
        Employee employee = new Employee();

        employee.setAge(i % 100);
        employee.setGender(i % 2 == 0 ? "male" : "female");

        employees.add(employee);
    }

    StopWatch watch = new StopWatch();

    watch.start();
    employees.stream()
        .filter(employee -> employee.getAge() > 32)
        .sorted(Comparator.comparingInt(Employee::getAge))
        .filter(employee -> "male".equals(employee.getGender()))
        .collect(Collectors.toList());
    watch.stop();
    System.out.println(watch.getTotalTimeMillis());
}

Each program was run 10 times, and the average was taken after dropping the highest and the lowest time (all times in ms):

Comparison group 1:

  • 1 filter
    86 77 80 87 80 83 95 74 78 85 average = 82 ms
  • 2 filters
    79 82 79 82 78 78 86 88 77 84 average = 81 ms

Comparison group 2:

  • 1 filter with sorted
    104 129 103 102 95 97 98 92 90 101 average = 99 ms
  • 2 filters with sorted
    112 114 136 113 114 121 112 126 125 111 average = 117 ms
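
The averaging rule described above (drop one highest and one lowest sample, then take the mean of the rest) can be checked with a few lines of Java. This is only a sketch: the class and method names are made up for illustration, and the samples are the measurements listed above.

```java
import java.util.Arrays;

class TrimmedAverage {
    // Drops one highest and one lowest sample, then averages the remaining ones.
    static double trimmedAverage(int... samples) {
        if (samples.length < 3) {
            throw new IllegalArgumentException("need at least 3 samples");
        }
        int[] sorted = samples.clone();
        Arrays.sort(sorted);
        long sum = 0;
        // Skip index 0 (lowest) and the last index (highest).
        for (int i = 1; i < sorted.length - 1; i++) {
            sum += sorted[i];
        }
        return (double) sum / (sorted.length - 2);
    }

    public static void main(String[] args) {
        System.out.println(trimmedAverage(86, 77, 80, 87, 80, 83, 95, 74, 78, 85));           // 82.0
        System.out.println(trimmedAverage(79, 82, 79, 82, 78, 78, 86, 88, 77, 84));           // 81.0
        System.out.println(trimmedAverage(104, 129, 103, 102, 95, 97, 98, 92, 90, 101));      // 99.0
        System.out.println(trimmedAverage(112, 114, 136, 113, 114, 121, 112, 126, 125, 111)); // 117.125
    }
}
```

The last value rounds to the 117 ms reported for "2 filters with sorted".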

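What pattern emerges?

  • One filter and two consecutive filters show no meaningful performance difference (82 ms vs. 81 ms).
  • Once a sorted operation sits between the two filters, merging the filters performs better (99 ms vs. 117 ms), since all filtering then happens before the sort and the sort sees fewer elements.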

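This matches what the design of the Stream API predicts. If you are interested, search for how the Stream API works internally (stateful ops vs. stateless ops, Stage, and the Sink interface).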
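
The difference between stateless and stateful operations can be made visible by logging the order in which each filter sees each element. The sketch below is an illustration only (the class name and the "f1"/"f2" labels are invented here), and the order shown is what a sequential stream on OpenJDK produces: consecutive filters process one element at a time through the whole chain, while sorted has to buffer every element before passing anything downstream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PipelineOrderDemo {
    // Records the order in which the two filters see each element.
    // The side effects in the lambdas are for demonstration only.
    static List<String> trace(boolean withSorted) {
        List<String> log = new ArrayList<>();
        List<Integer> source = Arrays.asList(3, 1, 2);
        if (withSorted) {
            source.stream()
                    .filter(x -> { log.add("f1:" + x); return true; })
                    .sorted() // stateful: buffers all elements first
                    .filter(x -> { log.add("f2:" + x); return true; })
                    .collect(Collectors.toList());
        } else {
            source.stream()
                    .filter(x -> { log.add("f1:" + x); return true; })
                    .filter(x -> { log.add("f2:" + x); return true; })
                    .collect(Collectors.toList());
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(trace(false)); // [f1:3, f2:3, f1:1, f2:1, f1:2, f2:2]
        System.out.println(trace(true));  // [f1:3, f1:1, f1:2, f2:1, f2:2, f2:3]
    }
}
```

In the first trace the two filters behave like a single pass over the data, which fits the measurement that one filter and two consecutive filters cost about the same. In the second trace everything that survives the first filter is sorted before the second filter runs, so elements the second filter would have rejected are sorted anyway.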

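A further question to think about:
Is there a performance difference between 1 filter and 3 filters, or n filters? (In theory there should not be; you can check it with real test code.)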
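
One possible starting point for that experiment is sketched below. Everything in it is an assumption made for illustration: the class name, the divisibility predicates, and the warm-up loop are not from the original tests. It chains n filter calls, merges the same predicates into one filter with Predicate.and, checks that both return the same list, and prints rough timings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class NFiltersSketch {
    // Hypothetical predicates: the k-th one keeps x when x % (k + 2) != 0.
    static List<Predicate<Integer>> predicates(int n) {
        List<Predicate<Integer>> result = new ArrayList<>();
        for (int k = 0; k < n; k++) {
            final int divisor = k + 2;
            result.add(x -> x % divisor != 0);
        }
        return result;
    }

    // One filter() call per predicate.
    static List<Integer> chained(List<Integer> data, List<Predicate<Integer>> ps) {
        Stream<Integer> s = data.stream();
        for (Predicate<Integer> p : ps) {
            s = s.filter(p);
        }
        return s.collect(Collectors.toList());
    }

    // All predicates AND-ed together inside a single filter() call.
    static List<Integer> merged(List<Integer> data, List<Predicate<Integer>> ps) {
        Predicate<Integer> all = ps.stream().reduce(x -> true, Predicate::and);
        return data.stream().filter(all).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> data = IntStream.rangeClosed(1, 100_000).boxed().collect(Collectors.toList());
        List<Predicate<Integer>> ps = predicates(3);
        // Warm up first so that JIT compilation is less likely to dominate the timings.
        for (int i = 0; i < 20; i++) {
            chained(data, ps);
            merged(data, ps);
        }
        long t0 = System.nanoTime();
        List<Integer> a = chained(data, ps);
        long t1 = System.nanoTime();
        List<Integer> b = merged(data, ps);
        long t2 = System.nanoTime();
        System.out.println("same result: " + a.equals(b));
        System.out.println("chained ns: " + (t1 - t0) + ", merged ns: " + (t2 - t1));
    }
}
```

Changing the argument of predicates(...) varies n. The timings from a hand-rolled loop like this are only indicative; a benchmark harness such as JMH would give more trustworthy numbers.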
