How do I use Java 8 streams groupingBy to calculate an average with filters?


Question


I need to calculate the average of a List of objects that I'm streaming.
The objects have:

ClassX.id
ClassX.name
ClassX.value
ClassX.startTime
ClassX.endTime

The objects must be grouped by ClassX.name, and the average must be calculated from ClassX.value.

Each streamed object represents either the start or the end of a transaction.
Start transactions have ClassX.endTime == null.
End transactions have ClassX.startTime == null.
End transactions have ClassX.name == null.

The value to be aggregated is in the start object, but it must be included in the average only if the stream also processes the corresponding end object of the transaction.

Here's what I have so far (based on Andreas's suggestion):

List<String> classXListStrings = ...

Map<String, Double> average = classXListStrings.stream()
        .map(ClassX::new) // convert to ClassX (the input list is actually String)
        .filter(x -> x.getName() != null) // avoid null entries for getName
        .collect(Collectors.groupingBy(ClassX::getName, Collectors.toList()))
        .entrySet().stream()
        // skip group if no end transaction exists
        .filter(e -> e.getValue().stream().anyMatch(x -> x.getStartTime() != null))
        .collect(Collectors.toMap(Entry::getKey,
                e -> e.getValue().stream()
                        // only average values of start transactions
                        .filter(x -> x.getEndTime() == null)
                        .collect(Collectors.averagingDouble(ClassX::getValue))
        ));

Is there a way to store the streamed objects in a data structure and then aggregate the value only when both the start and end objects of a transaction have been streamed, based on a filter?

Answer 1

Score: 1


It's hard to associate one object in a stream with another that appears later.

One solution is to run through the list twice: first find the end transactions and collect their ids into a set, then process the list again to compute the averages.

List<ClassX> inputList = ...

Set<String> endSet = inputList.stream()
    .filter(o -> o.endTime != null)
    .map(o -> o.id)
    .collect(Collectors.toSet());

Map<String, Double> average = inputList.stream()
    .filter(o -> o.startTime != null && endSet.contains(o.id))
    .collect(Collectors.groupingBy(
            o -> o.name,
            Collectors.averagingDouble(o -> o.value)));
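For anyone who wants to try the two-pass idea end to end, here is a minimal self-contained sketch. The ClassX class below is a hypothetical stand-in (the field types, the constructor, and the TwoPassAverage helper are assumptions, not the asker's real code), and it assumes a start object and its end object share the same id.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-in for the question's ClassX; field types are assumptions.
class ClassX {
    final String id;
    final String name;     // null on end transactions
    final double value;    // only meaningful on start transactions
    final Long startTime;  // null on end transactions
    final Long endTime;    // null on start transactions

    ClassX(String id, String name, double value, Long startTime, Long endTime) {
        this.id = id;
        this.name = name;
        this.value = value;
        this.startTime = startTime;
        this.endTime = endTime;
    }
}

class TwoPassAverage {
    static Map<String, Double> averageCompleted(List<ClassX> input) {
        // Pass 1: collect the ids of all end transactions.
        Set<String> endedIds = input.stream()
                .filter(o -> o.endTime != null)
                .map(o -> o.id)
                .collect(Collectors.toSet());

        // Pass 2: average the values of start transactions whose id was ended.
        return input.stream()
                .filter(o -> o.startTime != null && endedIds.contains(o.id))
                .collect(Collectors.groupingBy(
                        o -> o.name,
                        Collectors.averagingDouble(o -> o.value)));
    }

    public static void main(String[] args) {
        List<ClassX> input = Arrays.asList(
                new ClassX("1", "a", 10.0, 100L, null),
                new ClassX("2", "a", 20.0, 110L, null),
                new ClassX("3", "b", 5.0, 120L, null),
                new ClassX("4", "a", 99.0, 130L, null), // never ended, so ignored
                new ClassX("1", null, 0.0, null, 200L),
                new ClassX("2", null, 0.0, null, 210L),
                new ClassX("3", null, 0.0, null, 220L));

        // Averages: a is (10 + 20) / 2 = 15.0, b is 5.0
        System.out.println(averageCompleted(input));
    }
}
```

Because the second pass only keeps start objects, the null names on end objects never reach groupingBy.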

Answer 2

Score: 0


You can do it like this:

List<ClassX> classXList = ...

Map<String, Double> average = classXList.stream()
        .collect(Collectors.groupingBy(ClassX::getName, Collectors.toList()))
        .entrySet().stream()
        // skip group if no end transaction exists
        .filter(e -> e.getValue().stream().anyMatch(x -> x.getStartTime() == null))
        .collect(Collectors.toMap(Entry::getKey,
                e -> e.getValue().stream()
                        // only average values of start transactions
                        .filter(x -> x.getEndTime() == null)
                        .collect(Collectors.averagingDouble(ClassX::getValue))
        ));
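One caveat when grouping by name: Collectors.groupingBy throws a NullPointerException when the classifier returns null, and the question says end transactions have a null name. A possible variant, sketched below, groups by id first, keeps only the ids that have an end object, and then averages the start values by name. It assumes that a start object and its end object share the same non-null id; the ClassX class and the GroupByIdAverage helper are hypothetical stand-ins.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical stand-in for the question's ClassX; types and getters are assumptions.
class ClassX {
    private final String id;
    private final String name;     // null on end transactions
    private final double value;    // only meaningful on start transactions
    private final Long startTime;  // null on end transactions
    private final Long endTime;    // null on start transactions

    ClassX(String id, String name, double value, Long startTime, Long endTime) {
        this.id = id;
        this.name = name;
        this.value = value;
        this.startTime = startTime;
        this.endTime = endTime;
    }

    String getId() { return id; }
    String getName() { return name; }
    double getValue() { return value; }
    Long getStartTime() { return startTime; }
    Long getEndTime() { return endTime; }
}

class GroupByIdAverage {
    static Map<String, Double> averageCompleted(List<ClassX> input) {
        return input.stream()
                // Group start and end objects of the same transaction together.
                .collect(Collectors.groupingBy(ClassX::getId))
                .values().stream()
                // Keep only transactions that contain an end object.
                .filter(tx -> tx.stream().anyMatch(x -> x.getStartTime() == null))
                .flatMap(List::stream)
                // Keep only start objects, which carry the name and the value.
                .filter(x -> x.getEndTime() == null)
                .collect(Collectors.groupingBy(ClassX::getName,
                        Collectors.averagingDouble(ClassX::getValue)));
    }

    public static void main(String[] args) {
        List<ClassX> input = Arrays.asList(
                new ClassX("1", "a", 10.0, 100L, null),
                new ClassX("2", "a", 30.0, 110L, null), // never ended, so ignored
                new ClassX("3", "b", 4.0, 120L, null),
                new ClassX("1", null, 0.0, null, 200L),
                new ClassX("3", null, 0.0, null, 220L));

        // Averages: a is 10.0 (only transaction 1 completed), b is 4.0
        System.out.println(averageCompleted(input));
    }
}
```

Grouping by id instead of name means an unfinished transaction is dropped individually, rather than a whole name group being kept or skipped together.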

huangapple
  • Posted on August 13, 2020, 07:40:21
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63386058.html