mgo/mongodb: aggregate – find all and order by membercount however membercount is an array of member userids

huangapple go评论137阅读模式
英文:

mgo/mongodb: aggregate - find all and order by membercount however membercount is an array of member userids

问题

  1. MongoDB查询应该是这样的:
db.communities.aggregate([
  {
    $match: {
      category: "Art and Culture"
    }
  },
  {
    $addFields: {
      membercount: { $size: "$members" }
    }
  },
  {
    $sort: {
      membercount: 1
    }
  }
])
  1. 在Go/mgo中,可以这样实现:
func (ctx *CommunityContext) Directory() {
    pipe := ccommunity.Pipe([]bson.M{
        bson.M{
            "$match": bson.M{
                "category": "Art and Culture",
            },
        },
        bson.M{
            "$addFields": bson.M{
                "membercount": bson.M{
                    "$size": "$members",
                },
            },
        },
        bson.M{
            "$sort": bson.M{
                "membercount": 1,
            },
        },
    })
    iter := pipe.Iter()
    result := CommunityDirectory{}
    results := []CommunityDirectory{}
    for {
        if iter.Next(&result) {
            results = append(results, result)
            fmt.Println(result)
        } else {
            break
        }
    }
    ctx.JSON(results)
}

希望对你有所帮助!

英文:

The representation of 1 record (a community):

{
    "_id" : ObjectId("538a4734d6194c0e98000001"),
    "name" : "Darko",
    "description" : "Darko",
    "subdomain" : "darko",
    "domain" : "forum.dev",
    "created" : ISODate("2014-05-31T21:18:44.764Z"),
    "category" : "Art and Culture",
    "owner" : "53887456d6194c0f5b000001",
    "members" : [ 
        "53887456d6194c0f5b000001"
    ]
}

and the Go type

Community struct {
	Id          bson.ObjectId `bson:"_id,omitempty" json:"id"`
	Name        string        `json:"name"`
	Description string        `bson:",omitempty" json:"description"`
	Subdomain   string        `bson:",omitempty" json:"subdomain"`
	Domain      string        `json:"domain"`
	Created     time.Time     `json:"created"`
	Category    string        `json:"category"`
	Owner       string        `json:"owner"`
	Banned      []string      `bson:",omitempty" json:"banned"`
	Members     []string      `json:"members"`
	Moderators  []string      `bson:",omitempty" json:"moderators"`
	Admins      []string      `bson:",omitempty" json:"admins"`
	Logo        string        `bson:",omitempty" json:"logo"`
	Stylesheets []string      `bson:",omitempty" json:"stylesheets"`
	Javascripts []string      `bson:",omitempty" json:"javascripts"`
}

Ok now I'd like to retrieve a list of all communities of Category Art and Culture and order by the number of members aka members.length in js or len(Community.Members) in Go.

something like SELECT * FROM communities ORDER BY COUNT(members) WHERE category = 'Art and Culture'

I have a custom type to be filled or unmarshalled into

CommunityDirectory struct {
	Id          bson.ObjectId `bson:"_id,omitempty" json:"id"`
	Name        string        `json:"name"`
	Description string        `bson:",omitempty" json:"description"`
	Subdomain   string        `bson:",omitempty" json:"subdomain"`
	Domain      string        `json:"domain"`
	Created     time.Time     `json:"created"`
	Category    string        `json:"category"`
	Logo        string        `bson:",omitempty" json:"logo"`
	Membercount int64         `bson:"membercount" json:"membercount"`
}

What I have so far

func (ctx *CommunityContext) Directory() {
	pipe := ccommunity.Pipe([]bson.M{bson.M{"membercount": bson.M{"$size": "members"}}})
	iter := pipe.Iter()
	result := CommunityDirectory{}
	results := []CommunityDirectory{}
	for {
		if iter.Next(&result) {
			results = append(results, result)
			fmt.Println(result)
		} else {
			break
		}
	}
	ctx.JSON(results)
}

but this doesn't work because

db.communities.aggregate(
[
{"membercount": {$size:"members"}}
]
)

Error("Printing Stack Trace")@:0
()@src/mongo/shell/utils.js:37
([object Array])@src/mongo/shell/collection.js:866
@(shell):3

uncaught exception: aggregate failed: {
	"errmsg" : "exception: Unrecognized pipeline stage name: 'membercount'",
	"code" : 16436,
	"ok" : 0
}

So, it should find all, order by membercount and assign a new "virtual" field membercount but only of category 'Art and Culture'.

I find MongoDB quite complicated in this regard.

  1. What does the mongodb query look like?

  2. What does that look like in Go/mgo?

答案1

得分: 9

当你刚接触聚合框架时,有一些概念需要熟悉。

以下是在shell中正确使用聚合管道的形式:

db.communties.aggregate([
    // 首先匹配文档以过滤和可能利用索引
    { "$match": {
        "category": "Art and Culture"
    }},
    // 当添加另一个字段时,你需要包含所有字段,并且你想要所有字段
    { "$project": {
        "name": 1,
        "description": 1,
        "subdomain": 1,
        "domain": 1,
        "created": 1,
        "category": 1,
        "owner": 1,
        "members": 1,
        "memberCount": { "$size": "$members" }
    }},
    // $sort 表示 "ORDER BY",在这种情况下是升序排序
    { "$sort": { "memberCount": 1 } },
    // 可选地只投影结果中需要的字段
    { "$project": {
        "name": 1,
        "description": 1,
        "subdomain": 1,
        "domain": 1,
        "created": 1,
        "category": 1,
        "owner": 1,
        "members": 1
    }}
])

因此,没有直接等价于 "SELECT *" 的方式,除非你不想改变结构。在这里,你需要添加一个字段 "memberCount",所以你需要指定所有字段。你可以使用 $$ROOT,它会复制文档中的所有字段,但你需要在 $project 中将其分配给另一个字段/属性,例如:

{ "$project": {
    "_id": "$$ROOT",
    "memberCount": 1
}}

但现在,你的 "fields" 并不完全与之前相同,而且都以 _id. 为前缀。但这是个人口味的问题。

下一个要习惯的是始终首先尝试使用 $match。这不仅有助于减少在聚合管道的其余部分上操作的文档数量,而且还是使用索引优化查询的唯一机会。一旦你使用其他阶段修改了文档,就无法再使用索引,因为这不再是最初被索引的源。在某种程度上与 SQL 不同,但语义上有些不同。记住,"pipeline" 就像 Unix 的 "pipe" | 运算符一样,所以先进行 "matching"。

排序有自己的管道阶段。因此,使用 $sort 运算符作为管道阶段来完成排序。

最后的 $project 是可选的。在这里,我们只是丢弃了用于 "sort" 的 "memberCount" 字段。

使用 mGo 的用法应该如下所示:

pipeline := []bson.D{
    bson.M{"$match": bson.M{"category": "Art and Culture"}},
    bson.M{"$project": bson.M{
        "name":        1,
        "description": 1,
        "subdomain":   1,
        "domain":      1,
        "created":     1,
        "category":    1,
        "owner":       1,
        "members":     1,
        "memberCount": bson.M{"$size": "$members"},
    }},
    bson.M{"$sort": bson.M{"memberCount": 1}},
    bson.M{"$project": bson.M{
        "name":        1,
        "description": 1,
        "subdomain":   1,
        "domain":      1,
        "created":     1,
        "category":    1,
        "owner":       1,
        "members":     1,
    }},
}

pipe := ccommunity.Pipe(pipeline)

所以实际上与大多数你在文档中找到的示例形式并没有太大区别。

可能还要查看核心文档中提供的 SQL to aggregation Mapping Chart,以获取其他常见 SQL 查询在聚合管道中的示例。

英文:

There are a few concepts to get used to when you are new to the aggregation framework

The correct form of the pipeline as it would work in the shell should be this:

<!-- language: lang-js -->

db.communties.aggregate([

    // Match the documents first to filter and possibly make use of an index
    { &quot;$match&quot;: {
        &quot;category&quot;: &quot;Art and Culture&quot;
    }},

    // You include all fields when adding another and you want all
    { &quot;$project&quot;: {
        &quot;name&quot;: 1,
        &quot;description&quot;: 1,
        &quot;subdomain&quot;: 1,
        &quot;domain&quot;: 1,
        &quot;created&quot;: 1,
        &quot;category&quot;: 1,
        &quot;owner&quot;: 1,
        &quot;members&quot;: 1,
        &quot;memberCount&quot;: { &quot;$size&quot;: &quot;$members&quot; }
    }},

    // $sort means &quot;ORDER BY&quot; in this case the ascending
    { &quot;$sort&quot;: { &quot;memberCount&quot;: 1 } },

    // Optionally project just the fields you need in the result
    { &quot;$project&quot;: {
        &quot;name&quot;: 1,
        &quot;description&quot;: 1,
        &quot;subdomain&quot;: 1,
        &quot;domain&quot;: 1,
        &quot;created&quot;: 1,
        &quot;category&quot;: 1,
        &quot;owner&quot;: 1,
        &quot;members&quot;: 1
    }}
])

So there really is no direct equivalent of "SELECT *" unless you don't want to alter the structure at all. Here you need to add a field "memberCount" so you need to specify all of the fields. You can possibly use $$ROOT which copies all the fields in the document but you would need to assign that to another field/property in your $project, such as:

{ &quot;$project&quot;: {
    &quot;_id&quot;: &quot;$$ROOT&quot;,
    &quot;memberCount&quot;: 1
 }}

But now of course all your "fields" are not exactly the same as what they were and are all prefixed with _id.. But that is a matter of personal taste.

The next thing to get used to is always try to use $match first. Not only does this help reduce the documents operated on over the rest of the aggregation pipeline, it is also the only chance you get to use an index to optimize your query. Once you modify the documents with other stages, it's all over for using an index as this is no longer the original source that was indexed. Not that different from SQL really, but the semantics differ a somewhat in how you specify. Remember, "pipeline" just like a Unix "pipe" | operator, so do the "matching" first.

Sorting has it's own pipeline stage. So use the $sort operator for the pipeline stage to do this.

The final $project is optional. Here we are just discarding the "memberCount" field that was used to "sort".


The usage with mGo should turn out like this:

pipeline := [].bson.D{
    bson.M{&quot;$match&quot;: bson.M{ &quot;category&quot;: &quot;Art and Culture&quot; } },

    bson.M{&quot;$project&quot;: bson.M{
        &quot;name&quot;: 1,
        &quot;description&quot;: 1,
        &quot;subdomain&quot;: 1,
        &quot;domain&quot;: 1,
        &quot;created&quot;: 1,
        &quot;category&quot;: 1,
        &quot;owner&quot;: 1,
        &quot;members&quot;: 1,
        &quot;memberCount&quot;: bson.M{ &quot;$size&quot;: &quot;$members&quot; }
    }},

    bson.M{ &quot;$sort&quot;: bson.M{ &quot;memberCount&quot;: 1 } },

    bson.M{ &quot;$project&quot;: bson.M{
        &quot;name&quot;: 1,
        &quot;description&quot;: 1,
        &quot;subdomain&quot;: 1,
        &quot;domain&quot;: 1,
        &quot;created&quot;: 1,
        &quot;category&quot;: 1,
        &quot;owner&quot;: 1,
        &quot;members&quot;: 1
    }}
}

pipe := ccommunity.Pipe( pipeline )

So really not that different to the form of most examples out there you will find.

Possibly look at the SQL to aggregation Mapping Chart provided in the core documentation for other examples of common SQL queries as they apply to the aggregation pipeline.

huangapple
  • 本文由 发表于 2014年6月2日 03:58:46
  • 转载请务必保留本文链接:https://go.coder-hub.com/23983902.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定