问题

[
  {
    "date": "2022-04",
    "authors": { "Johnson": 2, "Smith": 1}
  },
  {
    "date": "2022-05",
    "authors": {"Johnson": 1, "Smith": 2, "Brooks": 1}
  }
]

英文:

I know theres other versions of this question on here, but I'm struggling to get this to line up. I have docs that are similar to:

[
{_id: 1, date: &quot;2022-04-08T23:30:12.000Z&quot;, books: [{author: &quot;Johnson&quot;, title: &quot;First Title&quot;}, {author: &quot;Smith&quot;, title: &quot;Second Title}]},
{_id: 2, date: &quot;2022-04-22T23:30:12.000Z&quot;, books: [{author: &quot;Johnson&quot;, title: &quot;Some Other Title&quot;}]},
{_id: 3, date: &quot;2022-05-05T23:30:12.000Z&quot;, books: [{author: &quot;Smith&quot;, title: &quot;Title Round 2&quot;}]},
{_id: 4, date: &quot;2022-05-15T23:30:12.000Z&quot;, books: [{author: &quot;Johnson&quot;, title: &quot;Found a Title&quot;, {author: &quot;Smith&quot;, title: &quot;Wrote again&quot;}, {author: &quot;Brooks&quot;, title: &quot;New Title&quot;}]}
]

I'm trying to group the documents by month-year and then run a count on the times a distinct value shows on the author field. So far I have a pipeline that looks like:

{
          &quot;$unwind&quot;: &quot;$books&quot;
        },
        {
          $project: {
            _id: 1,
            books: 1,
            month: {
              &quot;$month&quot;: &quot;$date&quot;
            },
            year: {
              &quot;$year&quot;: &quot;$date&quot;
            }
          }
        },
        {
          $project: {
            _id: 1,
            books: 1,
            date: {
              $concat: [
                {
                  $substr: [
                    &quot;$year&quot;,
                    0,
                    4
                  ]
                },
                &quot;-&quot;,
                {
                  $substr: [
                    &quot;$month&quot;,
                    0,
                    2
                  ]
                },
                
              ]
            }
          }
        },
        {
          $group: {
            _id: {
              date: &quot;$date&quot;,
              books: {
                freq: {
                  $sum: 1
                }
              }
            }
          }
        },
        {
          $project: {
            &quot;_id&quot;: 1,
            &quot;date&quot;: 1,
            &quot;books&quot;: 1
          }
        },
        
      ]
    }

I aiming for a final output that looks like:

[
{date: &quot;2022-04&quot;, authors: { &quot;Johnson&quot;: 2, &quot;Smith&quot;: 1}},
{date: &quot;2022-05&quot;, authors: {&quot;Johnson&quot;: 1, &quot;Smith&quot;: 2, &quot;Brooks&quot;: 1}}
]

I've seen ways to run a count on the subdocs, but in trying to implement I'm losing my group by date or just getting errors. I've seen enough to know its doable, just lost trying to get it just right. Any help is appreciated.

答案1

得分: 1

你的前三个阶段可以保持不变（假设日期存储为日期对象而不是字符串）。

之后：

按 $date 和 $books.author 字段分组，并计算每个组的出现次数。这将给你最终答案中所需的计数。
然后只按 $date 分组，并将每个计数推送到一个 authors 数组中，格式为键值对 {k:key,v:value}，以便在下一个阶段中将其转换为对象。
在 authors 数组上使用 $arrayToObject 将其转换为对象。

如果你还想对日期进行排序，请添加一个 { $sort: {date: 1 } } 阶段。

db.collection.aggregate([
  { $unwind: "$books" },
  { $project: { _id: 1, books: 1, month: { "$month": "$date" }, year: { "$year": "$date" } } },
  { $project: { _id: 1, books: 1, date: { $concat: [ { $substr: [ "$year", 0, 4 ] }, "-", { $substr: [ "$month", 0, 2 ] } ] } } },
  {
    $group: {
      _id: { date: "$date", author: "$books.author" },
      count: { $sum: 1 }
    }
  },
  {
    $group: {
      _id: "$_id.date",
      authors: { $push: { k: "$_id.author", v: "$count" } }
    }
  },
  {
    $project: {
      _id: 0,
      date: "$_id",
      authors: { $arrayToObject: "$authors" }
    }
  }
])

playground 你可以从顶部的 "Stage" 下拉菜单中查看中间结果。

英文:

your first 3 stages can stay the same (assuming the dates are stored as date objects and not as strings)

after that

group by $date and $books.author fields and count the occurrence of each group. This will give the counts you need in the final answer
Then group by $date only and push each count to an authors array in the format of key value {k:key,v:value} so that it can be converted to an object in next stage
$arrayToObject on authors array to convert it to an object

if you also want to sort the date add a { $sort: {date: 1 } } stage

db.collection.aggregate([
  { $unwind: &quot;$books&quot; },
  { $project: { _id: 1, books: 1, month: { &quot;$month&quot;: &quot;$date&quot; }, year: { &quot;$year&quot;: &quot;$date&quot; } } },
  { $project: { _id: 1, books: 1, date: { $concat: [ { $substr: [ &quot;$year&quot;, 0, 4 ] }, &quot;-&quot;, { $substr: [ &quot;$month&quot;, 0, 2 ] } ] } } },
  {
    $group: {
      _id: { date: &quot;$date&quot;, author: &quot;$books.author&quot; },
      count: { $sum: 1 }
    }
  },
  {
    $group: {
      _id: &quot;$_id.date&quot;,
      authors: { $push: { k: &quot;$_id.author&quot;, v: &quot;$count&quot; } }
    }
  },
  {
    $project: {
      _id: 0,
      date: &quot;$_id&quot;,
      authors: { $arrayToObject: &quot;$authors&quot; }
    }
  }
])

playground you can look at the intermediate results from the Stage dropdown on top

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

MongoDB聚合按日期分组并计算子文档数量

问题

答案1

在mongoose中使用正则表达式过滤部分单词？

MongoDB C#客户端在使用相同操作符的$or查询时出现问题。

检查值是否存在于一个对象数组中，使用Golang。

MongoDB（Mgo v2）投影返回父结构体。

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论