Sorting and Grouping on a list of objects

Question

I have a list of Procedure objects, as shown below:

Procedure1	01/01/2020
Procedure2	03/01/2020
Procedure3	03/01/2020
Procedure1	04/01/2020
Procedure5	05/01/2020, 02/01/2020
Procedure2	06/01/2020

and my Procedure class looks like this:

class Procedure {
	List<Date> procedureDate;
	String procedureName;
}

I want to sort and group the objects based on the below conditions.

  1. All procedures should be grouped by procedure name.
  2. Procedures must be in descending order of procedure date (the first element of the date list, i.e. procedureDate.get(0)).
  3. Procedures with the same name, grouped together, should be in descending order of date.

The end result must be:

Procedure2	06/01/2020
Procedure2	03/01/2020

Procedure5	05/01/2020, 02/01/2020

Procedure1	04/01/2020
Procedure1	01/01/2020

Procedure3	03/01/2020

I was able to achieve this using a Comparator and old Java code. Is it possible to achieve the same using Java 8 streams, collectors, and groupingBy?

Answer 1

Score: 2

This is a very interesting question. The solution is not as easy as it looks. You have to divide it into multiple steps:

  1. Get the maximum date for each procedureName group, based on the first date in each List<Date>.
  2. Compare the Procedure instances by the maximum Date of their group, taken from the Map<String, Date> created in step one.
  3. If those are equal, distinguish the instances by name (e.g. the two Procedure2 entries).
  4. If they are still equal, sort the Procedure instances by their actual first date.

Here is the demo at: https://www.jdoodle.com/iembed/v0/Te.

Step 1

List<Procedure> procedures = ...

Map<String, Date> map = procedures.stream().collect(
    Collectors.collectingAndThen(
        Collectors.groupingBy(
            Procedure::getProcedureName,
            Collectors.maxBy(Comparator.comparing(s -> s.getProcedureDate().get(0)))),
        s -> s.entrySet().stream()
            .filter(e -> e.getValue().isPresent())
            .collect(Collectors.toMap(
                Map.Entry::getKey,
                e -> e.getValue().get().getProcedureDate().get(0)))));

Explained: there is a simple way to get the Procedure with the maximum first date for each procedureName:

Map<String, Optional<Procedure>> mapOfOptionalProcedures = procedures.stream()
    .collect(Collectors.groupingBy(
        Procedure::getProcedureName,
        Collectors.maxBy(Comparator.comparing(o -> o.getProcedureDate().get(0)))));

However, the returned structure is a bit clumsy (Map<String, Optional<Procedure>>). To make it useful and return the Date directly, wrap the grouping in an additional collector, Collectors::collectingAndThen, which takes a Function as a result mapper:

Map<String, Date> map = procedures.stream().collect(
    Collectors.collectingAndThen(
        /* grouping part */,
        s -> s.entrySet().stream()
            .filter(e -> e.getValue().isPresent())
            .collect(Collectors.toMap(
                Map.Entry::getKey,
                e -> e.getValue().get().getProcedureDate().get(0)))));

... which is effectively the first snippet.
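
As an aside, the same Map<String, Date> can probably be built without the intermediate Optional by using the three-argument Collectors.toMap with a merge function that keeps the later date. The following is only a sketch under assumptions: the nested Procedure class, its getters, the jan2020 helper, and the sample() data are stand-ins invented for the demo, and every procedure is assumed to have at least one date.

```java
import java.util.Arrays;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MaxDatePerNameSketch {

    // Hypothetical stand-in for the question's Procedure class (getters assumed).
    static class Procedure {
        private final String procedureName;
        private final List<Date> procedureDate;

        Procedure(String procedureName, Date... dates) {
            this.procedureName = procedureName;
            this.procedureDate = Arrays.asList(dates);
        }

        String getProcedureName() { return procedureName; }
        List<Date> getProcedureDate() { return procedureDate; }
    }

    // Demo helper (an assumption): midnight of the given day in January 2020, local time zone.
    static Date jan2020(int day) {
        return new GregorianCalendar(2020, Calendar.JANUARY, day).getTime();
    }

    // The six rows from the question, read as day/month/year.
    static List<Procedure> sample() {
        return Arrays.asList(
                new Procedure("Procedure1", jan2020(1)),
                new Procedure("Procedure2", jan2020(3)),
                new Procedure("Procedure3", jan2020(3)),
                new Procedure("Procedure1", jan2020(4)),
                new Procedure("Procedure5", jan2020(5), jan2020(2)),
                new Procedure("Procedure2", jan2020(6)));
    }

    // Maps each name to the latest first date among the procedures sharing that name.
    static Map<String, Date> maxFirstDateByName(List<Procedure> procedures) {
        return procedures.stream().collect(Collectors.toMap(
                Procedure::getProcedureName,
                p -> p.getProcedureDate().get(0),
                (a, b) -> a.compareTo(b) >= 0 ? a : b)); // merge function keeps the later date
    }

    public static void main(String[] args) {
        maxFirstDateByName(sample())
                .forEach((name, date) -> System.out.println(name + " -> " + date));
    }
}
```

The merge function is only invoked when two procedures share a name, so no Optional handling or filtering is needed.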

Steps 2, 3 and 4

Basically, sort by the maximum date of each group (descending), then by name, and finally by each procedure's own first date (descending).

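Collections.sort(
    procedures,
    (l, r) -> {
        // 1) Descending by the maximum first date of each procedure's group.
        int dates = map.get(r.getProcedureName()).compareTo(map.get(l.getProcedureName()));
        if (dates != 0) {
            return dates;
        }
        // 2) Groups tie on their maximum date: keep equal names together.
        int names = l.getProcedureName().compareTo(r.getProcedureName());
        if (names != 0) {
            return names;
        }
        // 3) Same group: descending by the procedure's own first date.
        return r.getProcedureDate().get(0).compareTo(l.getProcedureDate().get(0));
    }
);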
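
The three comparison levels above can also be written as a Comparator chain. This is a rough end-to-end sketch, not the answer's original code: the nested Procedure class, its toString format, the jan2020 helper, and sample() are assumptions made for the demo, and the per-name map is built with Collectors.toMap rather than the groupingBy form shown earlier.

```java
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Calendar;
import java.util.Comparator;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupedSortSketch {

    // Hypothetical stand-in for the question's Procedure class (getters assumed).
    static class Procedure {
        private final String procedureName;
        private final List<Date> procedureDate;

        Procedure(String procedureName, Date... dates) {
            this.procedureName = procedureName;
            this.procedureDate = Arrays.asList(dates);
        }

        String getProcedureName() { return procedureName; }
        List<Date> getProcedureDate() { return procedureDate; }

        // Prints the name followed by its dates as dd/MM/yyyy, like the rows in the question.
        @Override
        public String toString() {
            SimpleDateFormat fmt = new SimpleDateFormat("dd/MM/yyyy");
            return procedureName + " " + procedureDate.stream()
                    .map(d -> fmt.format(d))
                    .collect(Collectors.joining(", "));
        }
    }

    // Demo helper (an assumption): midnight of the given day in January 2020, local time zone.
    static Date jan2020(int day) {
        return new GregorianCalendar(2020, Calendar.JANUARY, day).getTime();
    }

    // The six rows from the question, read as day/month/year.
    static List<Procedure> sample() {
        return Arrays.asList(
                new Procedure("Procedure1", jan2020(1)),
                new Procedure("Procedure2", jan2020(3)),
                new Procedure("Procedure3", jan2020(3)),
                new Procedure("Procedure1", jan2020(4)),
                new Procedure("Procedure5", jan2020(5), jan2020(2)),
                new Procedure("Procedure2", jan2020(6)));
    }

    // Returns a new list ordered by group maximum (desc), name, then own first date (desc).
    static List<Procedure> sortGrouped(List<Procedure> procedures) {
        // Latest first date per procedure name.
        Map<String, Date> maxByName = procedures.stream().collect(Collectors.toMap(
                Procedure::getProcedureName,
                p -> p.getProcedureDate().get(0),
                (a, b) -> a.compareTo(b) >= 0 ? a : b));

        Comparator<Procedure> byGroupMaxDesc =
                Comparator.comparing((Procedure p) -> maxByName.get(p.getProcedureName())).reversed();

        Comparator<Procedure> order = byGroupMaxDesc
                .thenComparing(Procedure::getProcedureName)
                .thenComparing((Procedure p) -> p.getProcedureDate().get(0),
                        Comparator.<Date>reverseOrder());

        return procedures.stream().sorted(order).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        sortGrouped(sample()).forEach(System.out::println);
    }
}
```

Unlike Collections.sort, this version leaves the input list untouched and returns a sorted copy. The explicit (Procedure p) lambda parameter types are there so that type inference still works once reversed() and thenComparing are chained.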

Sorted result

Using the legacy java.util.Date as in your question, the sorted procedures match your expected output (I have overridden the Procedure::toString method):

@Override
public String toString() {
    return procedureName + " " + procedureDate;
}

Procedure2 [Mon Jan 06 00:00:00 CET 2020]
Procedure2 [Fri Jan 03 00:00:00 CET 2020]
Procedure5 [Sun Jan 05 00:00:00 CET 2020, Thu Jan 02 00:00:00 CET 2020]
Procedure1 [Sat Jan 04 00:00:00 CET 2020]
Procedure1 [Wed Jan 01 00:00:00 CET 2020]
Procedure3 [Fri Jan 03 00:00:00 CET 2020]

Answer 2

Score: 1

My idea comes from functional programming, which is based on map-reduce. groupingBy/collect is really a form of reduce anyway, and this problem may be better treated as a "merge" than with the Stream grouping feature. Here is my implementation using only Streams:

List<Procedure> a = List.of(
    new Procedure(...),
    ...
);

List<Procedure> b = a.stream().map(p -> {                      // Prepare for reduce: wrap each object in its own Map
    Map<String, Procedure> mapP = new HashMap<>();
    mapP.put(p.getProcedureName(), p);
    return mapP;
}).reduce((p, q) -> {                                          // Use reduce to merge the maps
    q.entrySet().stream().forEach(qq -> {
        if (p.containsKey(qq.getKey())) {
            // Same name seen before: union both date lists (this mutates the original Procedure)
            p.get(qq.getKey()).setProcedureDate(
                new ArrayList<Date>(
                    Stream.concat(
                        p.get(qq.getKey()).getProcedureDate().stream(),
                        qq.getValue().getProcedureDate().stream())
                    .collect(Collectors.toSet())));
        } else {
            p.put(qq.getKey(), qq.getValue());
        }
    });
    return p;
}).get().values().stream().map(p -> {                          // Sort the dates inside each object (ascending)
    p.setProcedureDate(p.getProcedureDate().stream().sorted().collect(Collectors.toList()));
    return p;
}).sorted((x, y) ->                                            // Sort the objects by their first date (ascending)
    x.getProcedureDate().get(0).compareTo(y.getProcedureDate().get(0))
).collect(Collectors.toList());
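
The same "merge" idea can be sketched without mutating the input objects, by letting Collectors.toMap combine the dates of equally named procedures and then ordering the merged entries by their latest date, descending, as the question asks. This is a sketch under assumptions rather than the answerer's code: the nested Procedure class, the jan2020 helper, sample(), and mergeDatesByName are hypothetical names, every procedure is assumed to have at least one date, and duplicate dates collapse because a TreeSet is used (similar to the toSet step above).

```java
import java.util.Arrays;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeSet;
import java.util.stream.Collectors;

class MergeByNameSketch {

    // Hypothetical stand-in for the question's Procedure class (getters assumed).
    static class Procedure {
        private final String procedureName;
        private final List<Date> procedureDate;

        Procedure(String procedureName, Date... dates) {
            this.procedureName = procedureName;
            this.procedureDate = Arrays.asList(dates);
        }

        String getProcedureName() { return procedureName; }
        List<Date> getProcedureDate() { return procedureDate; }
    }

    // Demo helper (an assumption): midnight of the given day in January 2020, local time zone.
    static Date jan2020(int day) {
        return new GregorianCalendar(2020, Calendar.JANUARY, day).getTime();
    }

    // The six rows from the question, read as day/month/year.
    static List<Procedure> sample() {
        return Arrays.asList(
                new Procedure("Procedure1", jan2020(1)),
                new Procedure("Procedure2", jan2020(3)),
                new Procedure("Procedure3", jan2020(3)),
                new Procedure("Procedure1", jan2020(4)),
                new Procedure("Procedure5", jan2020(5), jan2020(2)),
                new Procedure("Procedure2", jan2020(6)));
    }

    // Merges all dates per name, then orders the names by their latest date, newest first.
    static Map<String, NavigableSet<Date>> mergeDatesByName(List<Procedure> procedures) {
        // Merge step: one sorted set of dates per procedure name.
        Map<String, TreeSet<Date>> merged = procedures.stream().collect(Collectors.toMap(
                Procedure::getProcedureName,
                p -> new TreeSet<Date>(p.getProcedureDate()),
                (x, y) -> { x.addAll(y); return x; }));

        // Ordering step: a LinkedHashMap keeps the insertion order of the sorted entries.
        // Names whose latest dates tie end up in no particular relative order.
        Map<String, NavigableSet<Date>> ordered = new LinkedHashMap<>();
        merged.entrySet().stream()
                .sorted((x, y) -> y.getValue().last().compareTo(x.getValue().last()))
                .forEachOrdered(e -> ordered.put(e.getKey(), e.getValue().descendingSet()));
        return ordered;
    }

    public static void main(String[] args) {
        mergeDatesByName(sample())
                .forEach((name, dates) -> System.out.println(name + " " + dates));
    }
}
```

Note that merging yields one entry per name holding all of its dates, which is a different shape from the question's expected output, where each original Procedure stays a separate row.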

huangapple
  • Posted on September 11, 2020 at 16:19:22
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63843343.html