May 25, 2023, 21:04:22

TopN aggregations on spring-data-mongodb

Question

I am trying to implement a grouping with the $topN aggregation operator using spring-data-mongo, and I cannot work out how to do it.

I know what I want on the MongoDB side. It is something like this:

[
  {
    $match: {
      field000: { $regex: ".*MATCHTHIS.*" },
      created: { $lte: new Date("2030-05-25T00:00:00.000+00:00") }
    }
  },
  {
    $group: {
      _id: "$field001",
      field001s: {
        $topN: {
          output: ["$field002", "$created"],
          sortBy: { created: -1 },
          n: 1
        }
      }
    }
  }
]

Meaning: for the set of documents that pass the $match stage, group by field001, sort each bucket by created in descending order, and keep the top n (here 1). The result is the most recently created document for each group.
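To make the intended semantics concrete, here is a minimal in-memory sketch in plain Java of what the $group stage with $topN is expected to compute. It does not touch MongoDB or Spring at all. The class `TopNSketch`, the nested `Doc` type and the method `topNPerGroup` are hypothetical names introduced only for this illustration.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory illustration of $group + $topN; not part of any MongoDB or Spring API.
final class TopNSketch {

    // Hypothetical stand-in for a stored document.
    static final class Doc {
        final String field001;
        final String field002;
        final Instant created;

        Doc(String field001, String field002, Instant created) {
            this.field001 = field001;
            this.field002 = field002;
            this.created = created;
        }
    }

    // Groups by field001, sorts each bucket by created descending, keeps at most n entries.
    static Map<String, List<Doc>> topNPerGroup(List<Doc> docs, int n) {
        Map<String, List<Doc>> buckets = new LinkedHashMap<>();
        for (Doc doc : docs) {
            buckets.computeIfAbsent(doc.field001, key -> new ArrayList<>()).add(doc);
        }
        Map<String, List<Doc>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Doc>> bucket : buckets.entrySet()) {
            List<Doc> sorted = new ArrayList<>(bucket.getValue());
            sorted.sort(Comparator.comparing((Doc d) -> d.created).reversed());
            int limit = Math.min(n, sorted.size());
            result.put(bucket.getKey(), new ArrayList<>(sorted.subList(0, limit)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
            new Doc("a", "old", Instant.parse("2023-01-01T00:00:00Z")),
            new Doc("a", "new", Instant.parse("2023-05-01T00:00:00Z")),
            new Doc("b", "only", Instant.parse("2023-03-01T00:00:00Z")));
        // Print the field002 value of the most recent document in each group.
        topNPerGroup(docs, 1).forEach(
            (key, top) -> System.out.println(key + " -> " + top.get(0).field002));
    }
}
```

With n = 1 each group keeps only its most recently created document; a larger n keeps up to n documents per group, still ordered newest first.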

I am having trouble translating this into spring-data-mongo.

Answer 1

Score: 1

Using MongoRepository, you can specify the pipeline with the @Aggregation annotation. Something like this:

// Each pipeline stage is its own string; ?0 and ?1 are bound to the method parameters.
// The date placeholder is left unquoted so that it is not bound as a string.
@Aggregation(pipeline = {
  "{ $match: { field000: { $regex: ?0 }, created: { $lte: ?1 } } }",
  "{ $group: { _id: '$field001', field001s: { $topN: { output: ['$field002', '$created'], sortBy: { created: -1 }, n: 1 } } } }"
})
// java.util.Date maps directly to a BSON date; ZonedDateTime typically needs a custom converter.
Object filterAndGroup(String regex, Date created);

Note that I have parameterized the search regex and the date value. Please adjust them as needed, along with the return type of the method.

Using MongoTemplate, you can try something along these lines:

import org.springframework.data.domain.Sort;
import org.springframework.data.mongodb.core.aggregation.*;
import org.springframework.data.mongodb.core.query.Criteria;

// Filter first; Criteria.and(String) chains a second field onto the same criteria.
MatchOperation matchStage = Aggregation.match(
  Criteria.where("field000").regex(".*MATCHTHIS.*")
    .and("created").lte(YOUR_JAVA_DATE_OBJECT)
);
// Keep only the fields needed downstream.
ProjectionOperation projectStage = Aggregation.project("field002", "created", "field001");
// Newest first, so that $first picks the most recent document of each group.
SortOperation sortByCreatedDesc = Aggregation.sort(Sort.by(Sort.Direction.DESC, "created"));
GroupOperation groupStage = Aggregation.group("field001").first("$$ROOT").as("field001s");

Aggregation aggregation = Aggregation.newAggregation(
  matchStage, projectStage, sortByCreatedDesc, groupStage);
AggregationResults<XYZModel> result = mongoTemplate.aggregate(
  aggregation, "collectionName", XYZModel.class);

Note that I have added two new stages, a projection and a sort, because $topN is not yet supported by Spring Data MongoDB. The pipeline projects the necessary fields, sorts by created in descending order, then groups the documents and picks the first one in each group.
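The sort-then-first workaround can be pictured with a small in-memory analogy in plain Java: sort everything newest first, then keep only the first document seen for each key. This is a sketch under assumptions, not Spring or MongoDB code; `SortThenFirstSketch`, `Doc` and `latestPerGroup` are hypothetical names used only for illustration.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// In-memory analogy for $sort followed by $group with $first; no MongoDB or Spring calls.
final class SortThenFirstSketch {

    // Hypothetical stand-in for a stored document.
    static final class Doc {
        final String field001;
        final String field002;
        final Instant created;

        Doc(String field001, String field002, Instant created) {
            this.field001 = field001;
            this.field002 = field002;
            this.created = created;
        }
    }

    static Map<String, Doc> latestPerGroup(List<Doc> docs) {
        return docs.stream()
            // Analogous to $sort: { created: -1 }
            .sorted(Comparator.comparing((Doc d) -> d.created).reversed())
            .collect(Collectors.toMap(
                d -> d.field001,          // analogous to _id: "$field001"
                Function.identity(),
                (first, later) -> first,  // analogous to $first: keep the first document seen
                LinkedHashMap::new));
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
            new Doc("a", "old", Instant.parse("2023-01-01T00:00:00Z")),
            new Doc("a", "new", Instant.parse("2023-05-01T00:00:00Z")),
            new Doc("b", "only", Instant.parse("2023-03-01T00:00:00Z")));
        latestPerGroup(docs).forEach(
            (key, doc) -> System.out.println(key + " -> " + doc.field002));
    }
}
```

This approach yields exactly one document per group, so it matches $topN only for n = 1. For a larger n, a different accumulator or the raw $topN stage would be needed.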

Note: I have not tested this answer, so you will need to try it out and adjust it.

Answer 2

Score: 0

After a lot of research, I realized that this is probably easy to solve on the most recent versions of spring-data-mongodb, since the $topN aggregation operator is implemented there. In my case, however, upgrading the stack to a version that supports it is not an option.

On the other hand, if you are using repositories, then @Charchit Kapoor's solution is probably the best one.

If you want to stick with MongoTemplate, it can be done this way:

```java
import com.mongodb.client.AggregateIterable;
import org.bson.Document;
import java.util.Arrays;

// mongoOps is the MongoOperations/MongoTemplate instance; the pipeline is passed to the driver as raw documents.
AggregateIterable<Document> result = mongoOps.getCollection(COLLECTION_NAME).aggregate(Arrays.asList(
    // Stage 1: filter by regex and creation date.
    new Document("$match",
        new Document("field000", new Document("$regex", ".*REGEXP_FIELD_000.*"))
            .append("created", new Document("$lte", new java.util.Date(1685318400000L)))),
    // Stage 2: group by field001 and keep the most recent entry of each group.
    new Document("$group",
        new Document("_id", "$field001")
            .append("Field0001s",
                new Document("$topN",
                    new Document("output", Arrays.asList("$field002", "$created"))
                        .append("sortBy", new Document("created", -1L))
                        .append("n", 1L))))));
```

You can split this into different methods for better readability.
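As a rough sketch of that split, the stages can be built by small helper methods. To keep the example runnable without the MongoDB driver, the stages are built here as plain `java.util.Map` instances. That is an assumption made for illustration; with the driver on the classpath, each map could be wrapped with `new Document(map)`, or the helpers could return `Document` directly. The names `TopNPipelineSketch`, `entry`, `matchStage`, `groupTopNStage` and `pipeline` are hypothetical, and the output field is called `field001s` as in the question.

```java
import java.util.Arrays;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: stages are plain maps so the example runs without the MongoDB driver.
final class TopNPipelineSketch {

    // Builds a one-entry map such as { "$lte": value }.
    static Map<String, Object> entry(String key, Object value) {
        Map<String, Object> map = new LinkedHashMap<>();
        map.put(key, value);
        return map;
    }

    // { $match: { field000: { $regex: regex }, created: { $lte: createdUpTo } } }
    static Map<String, Object> matchStage(String regex, Date createdUpTo) {
        Map<String, Object> criteria = new LinkedHashMap<>();
        criteria.put("field000", entry("$regex", regex));
        criteria.put("created", entry("$lte", createdUpTo));
        return entry("$match", criteria);
    }

    // { $group: { _id: "$field001", field001s: { $topN: { output, sortBy, n } } } }
    static Map<String, Object> groupTopNStage(int n) {
        Map<String, Object> topN = new LinkedHashMap<>();
        topN.put("output", Arrays.asList("$field002", "$created"));
        topN.put("sortBy", entry("created", -1));
        topN.put("n", n);
        Map<String, Object> group = new LinkedHashMap<>();
        group.put("_id", "$field001");
        group.put("field001s", entry("$topN", topN));
        return entry("$group", group);
    }

    static List<Map<String, Object>> pipeline(String regex, Date createdUpTo, int n) {
        return Arrays.asList(matchStage(regex, createdUpTo), groupTopNStage(n));
    }

    public static void main(String[] args) {
        // Prints the two stages as nested maps for inspection.
        System.out.println(pipeline(".*MATCHTHIS.*", new Date(1685318400000L), 1));
    }
}
```

Keeping one method per stage makes it easier to change the regex, the cutoff date or n without re-reading the whole nested expression.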
