How to calculate counts using groupingBy in Java 8?

Question

How can I use groupingBy together with a counting function?

I have the following data (the records created in the code below) and want this output:

Desired output:

Stock | TOTAL | INPROGRESS | CANCEL | COMPLETED
-----------------------------------------------
HCL   |     4 |          2 |      1 |        1
TCS   |     5 |          2 |      0 |        2

Note: I store this data in a HashMap. Key: OrderId, value: OrderModel (PairName, status).

I can achieve this with the following code, but I am looking for a better way to write it. Please suggest other approaches if you have any.

public static void main(String[] args) {

    Map<Integer, Model> test = new HashMap<Integer, Model>();

    Model m1 = new Model("HCL", "Inprogress");
    Model m2 = new Model("HCL", "Cancel");
    Model m3 = new Model("HCL", "Inprogress");
    Model m4 = new Model("HCL", "Completed");

    Model a1 = new Model("TCS", "Inprogress");
    Model a2 = new Model("TCS", "Inprogress");
    Model a3 = new Model("TCS", "Inprogress");
    Model a4 = new Model("TCS", "Completed");
    Model a5 = new Model("TCS", "Completed");

    int count = 1;

    test.put(count++, m1);
    test.put(count++, m2);
    test.put(count++, m3);
    test.put(count++, m4);

    test.put(count++, a1);
    test.put(count++, a2);
    test.put(count++, a3);
    test.put(count++, a4);
    test.put(count, a5);

    Map<String, Model> countPair = new HashMap<String, Model>();

    test.forEach((k, v) -> System.out.println("Key:" + k + " Pair:" + v.getName() + " Status:" + v.getStatus()));

    System.out.println(" With Count !!!!");
    List<Model> list = new ArrayList<Model>();

    test.entrySet().stream().forEach(e -> list.add(e.getValue()));

    // Total
    Map<String, Long> counted = list.stream().collect(Collectors.groupingBy(Model::getName, Collectors.counting()));
    counted.forEach((k, v) -> {
        countPair.put(k, new Model.ModelBuilder().setTotal(v).build());
    });

    // Inprogress
    counted = list.stream().filter(e -> e.getStatus().equals("Inprogress"))
            .collect(Collectors.groupingBy(Model::getName, Collectors.counting()));
    counted.forEach((k, v) -> countPair.get(k).setInProgressCount(v));

    // Cancel
    counted = list.stream().filter(e -> e.getStatus().equals("Cancel"))
            .collect(Collectors.groupingBy(Model::getName, Collectors.counting()));
    counted.forEach((k, v) -> countPair.get(k).setCancelCount(v));

    // Completed
    counted = list.stream().filter(e -> e.getStatus().equals("Completed"))
            .collect(Collectors.groupingBy(Model::getName, Collectors.counting()));
    counted.forEach((k, v) -> countPair.get(k).setCompletedCount(v));

    countPair.forEach((k, v) -> System.out.println("Pair : " + k + " : " + v.getTotal() + " , "
            + v.getInProgressCount() + " , " + v.getCancelCount() + " , " + v.getCompletedCount()));

}

Model:

public class Model {

    private String name;
    private String status;

    private long total;
    private long inProgressCount;
    private long completedCount;
    private long cancelCount;

    static class ModelBuilder {
        private long total;
        private long inProgressCount;
        private long completedCount;
        private long cancelCount;

        public ModelBuilder setTotal(long total) {
            this.total = total;
            return this;
        }

        public ModelBuilder setInProgressCount(long inProgressCount) {
            this.inProgressCount = inProgressCount;
            return this;
        }

        public ModelBuilder setCompletedCount(long completedCount) {
            this.completedCount = completedCount;
            return this;
        }

        public ModelBuilder setCancelCount(long cancelCount) {
            this.cancelCount = cancelCount;
            return this;
        }

        public Model build() {
            return new Model(this);
        }

    }

    public Model(ModelBuilder modelBuilder) {
        this.total = modelBuilder.total;
        this.inProgressCount = modelBuilder.inProgressCount;
        this.completedCount = modelBuilder.completedCount;
        this.cancelCount = modelBuilder.cancelCount;
    }

    public Model(String name, String status) {
        super();
        this.name = name;
        this.status = status;
    }
    //getter and setter
}
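
One small simplification to the question's code, independent of how the counting itself is done: the intermediate list is probably unnecessary, because the values of the map can be streamed directly with Map.values(). The following is only a minimal sketch; the nested Item class is a hypothetical stand-in for Model that keeps just the name and status, and totalsByName is an illustrative helper name, not something from the original post.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

class ValuesStreamSketch {

    // Hypothetical stand-in for the question's Model (name and status only).
    static class Item {
        private final String name;
        private final String status;

        Item(String name, String status) {
            this.name = name;
            this.status = status;
        }

        String getName() { return name; }
        String getStatus() { return status; }
    }

    // Counts items per name by streaming the map's values directly,
    // without first copying them into a separate list.
    static Map<String, Long> totalsByName(Map<Integer, Item> orders) {
        return orders.values().stream()
                .collect(Collectors.groupingBy(Item::getName, Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<Integer, Item> orders = new HashMap<>();
        orders.put(1, new Item("HCL", "Inprogress"));
        orders.put(2, new Item("HCL", "Cancel"));
        orders.put(3, new Item("TCS", "Completed"));

        Map<String, Long> totals = totalsByName(orders);
        System.out.println("HCL total: " + totals.get("HCL"));
        System.out.println("TCS total: " + totals.get("TCS"));
    }
}
```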

Answer 1

Score: 2

I would do it like this:

// Requires java.util.Arrays, java.util.List, java.util.Map and java.util.stream.Collectors
List<Model> test = Arrays.asList(
        new Model("HCL", "Inprogress"),
        new Model("HCL", "Cancel"),
        new Model("HCL", "Inprogress"),
        new Model("HCL", "Completed"),
        new Model("TCS", "Inprogress"),
        new Model("TCS", "Inprogress"),
        new Model("TCS", "Inprogress"),
        new Model("TCS", "Completed"),
        new Model("TCS", "Completed")
);
// Group by name first, then by status within each name, counting the entries
Map<String, Map<String, Long>> result = test.stream()
        .collect(Collectors.groupingBy(Model::getName,
                Collectors.groupingBy(Model::getStatus,
                        Collectors.counting())));
result.entrySet().forEach(System.out::println);

Output

HCL={Cancel=1, Completed=1, Inprogress=2}
TCS={Completed=2, Inprogress=3}

Producing the desired output from that result object should not be a problem, since all the heavy lifting has already been done.

// Requires java.util.Map and java.util.Map.Entry
System.out.println("Stock | TOTAL | INPROGRESS | CANCEL | COMPLETED");
System.out.println("-----------------------------------------------");
for (Entry<String, Map<String, Long>> nameEntry : result.entrySet()) {
    String name = nameEntry.getKey();
    Map<String, Long> statusCounts = nameEntry.getValue();
    // A status that never occurs for this name has no map entry, so default to 0
    long inprogress = statusCounts.getOrDefault("Inprogress", 0L);
    long cancel     = statusCounts.getOrDefault("Cancel"    , 0L);
    long completed  = statusCounts.getOrDefault("Completed" , 0L);
    System.out.printf("%-5s | %5d | %10d | %6d | %8d%n", name,
                      inprogress + cancel + completed,
                      inprogress, cancel, completed);
}

Output

Stock | TOTAL | INPROGRESS | CANCEL | COMPLETED
-----------------------------------------------
HCL   |     4 |          2 |      1 |        1
TCS   |     5 |          3 |      0 |        2

Note: Comparing this with the "Desired output" in the question shows that the question's example is wrong: INPROGRESS for TCS is 3, not 2.
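
If a per-name summary object is preferred over a nested map (closer to what the question builds with countPair), one possible variation is to pass a custom downstream collector to groupingBy. This is only a sketch under assumptions: Item is a hypothetical stand-in for Model, and Counts, add, merge and summarize are illustrative names that do not appear in the original post.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class StatusSummarySketch {

    // Hypothetical stand-in for the question's Model (name and status only).
    static class Item {
        final String name;
        final String status;

        Item(String name, String status) {
            this.name = name;
            this.status = status;
        }

        String getName() { return name; }
        String getStatus() { return status; }
    }

    // Hypothetical mutable accumulator: one instance per stock name.
    static class Counts {
        long total, inProgress, cancel, completed;

        void add(Item item) {
            total++;
            switch (item.getStatus()) {
                case "Inprogress": inProgress++; break;
                case "Cancel":     cancel++;     break;
                case "Completed":  completed++;  break;
                default: break; // unknown statuses only count toward the total
            }
        }

        // Combines two partial results; needed when the stream runs in parallel.
        Counts merge(Counts other) {
            total += other.total;
            inProgress += other.inProgress;
            cancel += other.cancel;
            completed += other.completed;
            return this;
        }
    }

    // Groups by name and folds each group into a Counts object in a single pass.
    static Map<String, Counts> summarize(List<Item> items) {
        Collector<Item, Counts, Counts> perName =
                Collector.of(Counts::new, Counts::add, Counts::merge);
        // TreeMap keeps the names sorted for printing.
        return items.stream()
                .collect(Collectors.groupingBy(Item::getName, TreeMap::new, perName));
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item("HCL", "Inprogress"), new Item("HCL", "Cancel"),
                new Item("HCL", "Inprogress"), new Item("HCL", "Completed"),
                new Item("TCS", "Inprogress"), new Item("TCS", "Inprogress"),
                new Item("TCS", "Inprogress"), new Item("TCS", "Completed"),
                new Item("TCS", "Completed"));

        System.out.println("Stock | TOTAL | INPROGRESS | CANCEL | COMPLETED");
        summarize(items).forEach((name, c) ->
                System.out.printf("%-5s | %5d | %10d | %6d | %8d%n",
                        name, c.total, c.inProgress, c.cancel, c.completed));
    }
}
```

The trade-off is that the set of statuses is hard-coded in Counts, whereas the nested groupingBy above adapts to whatever status values occur in the data.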

Answer 2

Score: 2

I agree with Andreas about the best stream solution.
But because you tagged the question with performance, I think we should compare execution times, so I compared vijayk's solution, Andreas's solution, and a solution without streams.

Here are the results for 9 objects in the list ("stream" refers to vijayk's solution):

Total execution with stream in ms: 7.642649ms
Total execution with for each in ms: 0.037637ms
Total execution with Andreas stream in ms: 0.392906ms
foreach was faster than vijayk by : 203.06211972261337
Andreas was faster than vijayk by : 19.451596565081726
foreach was faster than Andreas by : 10.439354890134707

Here are the results for 18,000,000 objects in the list:

Total execution with stream in ms: 703.025082ms
Total execution with for each in ms: 278.319758ms
Total execution with Andreas stream in ms: 504.190017ms
foreach was faster than vijayk by : 2.5259618183485197
Andreas was faster than vijayk by : 1.3943653350835783
foreach was faster than Andreas by : 1.8115494948080546
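
The answer does not show how these timings were taken. Single-run measurements on the JVM are usually dominated by class loading and JIT warm-up, which may explain much of the gap seen for 9 objects. The sketch below illustrates one simple way to time a task with and without warm-up runs; timeMillis, sampleData and countByName are hypothetical helper names, and the "name:status" strings are an assumed stand-in for Model objects. For numbers that need to be trusted, a dedicated harness such as JMH is the more reliable choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class TimingSketch {

    // Runs the task warmupRuns times untimed, then returns the duration
    // of one further run in milliseconds.
    static double timeMillis(Supplier<?> task, int warmupRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.get();
        }
        long start = System.nanoTime();
        task.get();
        return (System.nanoTime() - start) / 1_000_000.0;
    }

    // Hypothetical test data: "name:status" strings instead of Model objects.
    static List<String> sampleData(int size) {
        String[] names = {"HCL", "TCS"};
        String[] statuses = {"Inprogress", "Cancel", "Completed"};
        List<String> data = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            data.add(names[i % names.length] + ":" + statuses[i % statuses.length]);
        }
        return data;
    }

    // The operation being timed: count entries per name (the part before ':').
    static Map<String, Long> countByName(List<String> data) {
        return data.stream().collect(Collectors.groupingBy(
                s -> s.substring(0, s.indexOf(':')), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> data = sampleData(100_000);
        double cold = timeMillis(() -> countByName(data), 0);
        double warm = timeMillis(() -> countByName(data), 20);
        System.out.printf("cold run: %.3f ms, after warm-up: %.3f ms%n", cold, warm);
    }
}
```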

Then I changed the streams to parallel streams, and the results look like this.

For 9 objects in the list:

Total execution with stream in ms: 20.937947ms
Total execution with for each in ms: 0.042329ms
Total execution with Andreas stream in ms: 0.496791ms
foreach was faster than vijayk by : 494.64780646837863
Andreas was faster than vijayk by : 42.14638952799064
foreach was faster than Andreas by : 11.736421838455906

For 18,000,000 objects in the list:

Total execution with stream in ms: 476.563756ms
Total execution with for each in ms: 278.438998ms
Total execution with Andreas stream in ms: 302.730519ms
foreach was faster than vijayk by : 1.7115553475738337
Andreas was faster than vijayk by : 1.5742177484259523
foreach was faster than Andreas by : 1.087241805833535
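
The parallel variant that was measured is not shown in the answer. As a rough sketch of what switching to a parallel stream can look like for the nested grouping, the code below uses parallelStream() and, as an alternative, Collectors.groupingByConcurrent for the outer level. Item is a hypothetical stand-in for Model, and the method names are illustrative only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

class ParallelCountSketch {

    // Hypothetical stand-in for the question's Model (name and status only).
    static class Item {
        final String name;
        final String status;

        Item(String name, String status) {
            this.name = name;
            this.status = status;
        }

        String getName() { return name; }
        String getStatus() { return status; }
    }

    // Sequential baseline: name -> status -> count.
    static Map<String, Map<String, Long>> countSequential(List<Item> items) {
        return items.stream().collect(Collectors.groupingBy(Item::getName,
                Collectors.groupingBy(Item::getStatus, Collectors.counting())));
    }

    // Same collector on a parallel stream; partial maps are merged at the end.
    static Map<String, Map<String, Long>> countParallel(List<Item> items) {
        return items.parallelStream().collect(Collectors.groupingBy(Item::getName,
                Collectors.groupingBy(Item::getStatus, Collectors.counting())));
    }

    // Outer level collected into a single shared ConcurrentMap instead of merged maps.
    static ConcurrentMap<String, Map<String, Long>> countConcurrent(List<Item> items) {
        return items.parallelStream().collect(Collectors.groupingByConcurrent(Item::getName,
                Collectors.groupingBy(Item::getStatus, Collectors.counting())));
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item("HCL", "Inprogress"), new Item("HCL", "Cancel"),
                new Item("HCL", "Inprogress"), new Item("HCL", "Completed"),
                new Item("TCS", "Inprogress"), new Item("TCS", "Inprogress"),
                new Item("TCS", "Inprogress"), new Item("TCS", "Completed"),
                new Item("TCS", "Completed"));

        Map<String, Map<String, Long>> expected = countSequential(items);
        System.out.println("parallel matches sequential:   " + expected.equals(countParallel(items)));
        System.out.println("concurrent matches sequential: " + expected.equals(countConcurrent(items)));
    }
}
```

As the 9-object figures above suggest, parallel streams add coordination overhead, so they tend to pay off only for large inputs.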

The for-each solution looks like this:

for (Model item : list) {
    Model itemToIncrease;
    if (countPair.containsKey(item.getName())) {
        itemToIncrease = countPair.get(item.getName());
    } else {
        countPair.put(item.getName(), item);
        itemToIncrease = item;
    }
    itemToIncrease.increaseTotal();
    switch (item.getStatus()) {
        case "Inprogress":
            itemToIncrease.increaseInProgressCount();
            break;
        case "Cancel":
            itemToIncrease.increaseCancelCount();
            break;
        case "Completed":
            itemToIncrease.increaseCompletedCount();
            break;
    }
}

To sum up, I would say that Andreas's solution is very good when you have a lot of data.
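
Note that the loop above stores the first Model seen for each name in countPair and then increments counters on that input object. A variant that leaves the input untouched can be sketched with Map.computeIfAbsent and Map.merge, producing the same name -> status -> count structure as the nested groupingBy. This is an assumption-laden sketch rather than the code that was benchmarked: Item is a hypothetical stand-in for Model and countWithLoop is an illustrative name.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LoopCountSketch {

    // Hypothetical stand-in for the question's Model (name and status only).
    static class Item {
        final String name;
        final String status;

        Item(String name, String status) {
            this.name = name;
            this.status = status;
        }

        String getName() { return name; }
        String getStatus() { return status; }
    }

    // Plain loop, single pass: name -> status -> count, without mutating the items.
    static Map<String, Map<String, Long>> countWithLoop(List<Item> items) {
        Map<String, Map<String, Long>> result = new HashMap<>();
        for (Item item : items) {
            result.computeIfAbsent(item.getName(), k -> new HashMap<>())
                  .merge(item.getStatus(), 1L, Long::sum); // start at 1 or add 1
        }
        return result;
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item("HCL", "Inprogress"), new Item("HCL", "Cancel"),
                new Item("HCL", "Inprogress"), new Item("HCL", "Completed"),
                new Item("TCS", "Inprogress"), new Item("TCS", "Inprogress"),
                new Item("TCS", "Inprogress"), new Item("TCS", "Completed"),
                new Item("TCS", "Completed"));

        Map<String, Map<String, Long>> counts = countWithLoop(items);
        System.out.println("HCL Inprogress: " + counts.get("HCL").getOrDefault("Inprogress", 0L));
        System.out.println("TCS Cancel:     " + counts.get("TCS").getOrDefault("Cancel", 0L));
    }
}
```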

huangapple
  • Posted on September 9, 2020, 18:58:13
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63810182.html