Calculating selected users' different rankings in one MongoDB query

Question

I have a collection of users, each with daily, monthly and yearly scores. Because the scores change in real time, I don't want to precalculate the ranks; I only want to calculate them for certain users on demand. Is there a way (a faceted query or something else) to get these rankings for several users in one call?

Below is a sample user document:

{
    "_id": "5e0a361d1ca215003e79f388",
    "score_day": 20,
    "score_week": 203,
    "score_month": 850
}

Expected result given an array of user ids:

[
    {
        "_id": "???",
        "rank_day": 42,
        "rank_month": 84,
        "rank_year": 65
    },
    {
        "_id": "???",
        "rank_day": 12,
        "rank_month": 8,
        "rank_year": 68
    },
    ...
]

Answer 1

Score: 0

MongoDB doesn't have an easy way of getting the rank of a document after a sort. It is possible with the aggregation framework, but this is not a good idea for big collections, because the data has to be sorted in memory for each score field every time the query runs.

If this turns out to be too slow, you may want to look into data stores with built-in sorted structures, such as Redis sorted sets. MongoDB wasn't built for this use case.

Also, please make sure the score fields are indexed so that sorting can use those indexes where possible. Keep in mind that stages inside $facet cannot use indexes, so the pipeline below still sorts in memory.

var targetIds = ["5e0a361d1ca215003e79f388", "5e0a361d1ca215003e79f389" /* , ... */];

db.collection.aggregate([
    { // handle multiple pipelines at once on the same documents
        $facet: {
            rank_day: [
                { // we only need the `_id` and `score_XXX` field, this will make the query use less memory
                    $project: {
                        _id: 1,
                        score_day: 1
                    }
                },
                { // sort on the score
                    $sort: {
                        score_day: -1
                    }
                },
                { // push all documents, sorted, into an array
                    $group: {
                        _id: "",
                        rankings: {
                            $push: "$$ROOT"
                        }
                    }
                },
                { // unwind the formed array back into separate documents, but pass the index
                    $unwind: {
                        path: "$rankings",
                        includeArrayIndex: "rank"
                    }
                },
                { // find the documents we need
                    $match: {
                        "rankings._id": { "$in": targetIds }
                    }
                },
                { // we only need the _id and the rank, score is useless now
                    $project: {
                        _id: "$rankings._id",
                        rank: 1
                    }
                }
            ],
            // handle the 2 other rankings in same way
            rank_week: [
                {
                    $project: {
                        _id: 1,
                        score_week: 1
                    }
                },
                {
                    $sort: {
                        score_week: -1
                    }
                },
                {
                    $group: {
                        _id: "",
                        rankings: {
                            $push: "$$ROOT"
                        }
                    }
                },
                {
                    $unwind: {
                        path: "$rankings",
                        includeArrayIndex: "rank"
                    }
                },
                {
                    $match: {
                        "rankings._id": { "$in": targetIds }
                    }
                },
                {
                    $project: {
                        _id: "$rankings._id",
                        rank: 1
                    }
                }
            ],
            rank_month: [
                {
                    $project: {
                        _id: 1,
                        score_month: 1
                    }
                },
                {
                    $sort: {
                        score_month: -1
                    }
                },
                {
                    $group: {
                        _id: "",
                        rankings: {
                            $push: "$$ROOT"
                        }
                    }
                },
                {
                    $unwind: {
                        path: "$rankings",
                        includeArrayIndex: "rank"
                    }
                },
                {
                    $match: {
                        "rankings._id": { "$in": targetIds }
                    }
                },
                {
                    $project: {
                        _id: "$rankings._id",
                        rank: 1
                    }
                }
            ]
        }
    },


    // cleanup result to expected output
    { // add the searched _ids and unwind them so we can project and filter the arrays in each document
        $addFields: {
            _id: targetIds
        }
    },
    {
        $unwind: {
            path: "$_id"
        }
    },
    { // filter the rankings based on the _id
        $project: {
            _id: 1,
            rank_day: { $filter: { input: "$rank_day", as: "rank", cond: { $eq: ["$$rank._id", "$_id"] } } },
            rank_week: { $filter: { input: "$rank_week", as: "rank", cond: { $eq: ["$$rank._id", "$_id"] } } },
            rank_month: { $filter: { input: "$rank_month", as: "rank", cond: { $eq: ["$$rank._id", "$_id"] } } }
        }
    },
    { // reduce each filtered array to its "rank" value (a zero-based array index, so the top score has rank 0)
        $project: {
            _id: 1,
            rank_day: { $arrayElemAt: ["$rank_day.rank", 0] },
            rank_week: { $arrayElemAt: ["$rank_week.rank", 0] },
            rank_month: { $arrayElemAt: ["$rank_month.rank", 0] }
        }
    }
]);
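
The three sub-pipelines inside $facet differ only in the score field they sort on, so they could be generated instead of written out by hand. Below is a minimal sketch of that idea. The helper names buildRankPipeline and buildRankFacet are hypothetical and not part of the original answer, and the sketch uses _id: null in $group where the answer uses an empty string (both put every document into a single group).

```javascript
// Hypothetical helper (not from the original answer): builds one $facet
// sub-pipeline that ranks documents by `scoreField`, highest score first.
function buildRankPipeline(scoreField, targetIds) {
    return [
        // keep only _id and the score being ranked, to use less memory
        { $project: { _id: 1, [scoreField]: 1 } },
        // highest score first
        { $sort: { [scoreField]: -1 } },
        // collect the sorted documents into a single array
        { $group: { _id: null, rankings: { $push: "$$ROOT" } } },
        // unwind again, recording each document's zero-based array index
        { $unwind: { path: "$rankings", includeArrayIndex: "rank" } },
        // keep only the requested users
        { $match: { "rankings._id": { $in: targetIds } } },
        // return the user's _id and its rank
        { $project: { _id: "$rankings._id", rank: 1 } }
    ];
}

// Builds the whole $facet stage for the three score fields of the sample document.
function buildRankFacet(targetIds) {
    const facet = {};
    for (const period of ["day", "week", "month"]) {
        facet["rank_" + period] = buildRankPipeline("score_" + period, targetIds);
    }
    return { $facet: facet };
}

// Example: inspect the generated stage before passing it to aggregate().
const facetStage = buildRankFacet(["5e0a361d1ca215003e79f388"]);
console.log(JSON.stringify(facetStage, null, 2));
```

The generated stage would take the place of the hand-written $facet stage, with the cleanup stages after it left unchanged.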

Answer 2

Score: 0

As I understand your question, you want to rank documents by their scores. MongoDB does not yet provide a direct operator for getting a document's rank. There is a workaround, though, which works well for collections where no two documents share the same value in the field being ranked, or where it is acceptable for documents with equal values to receive different ranks.

[ 
{ $sort: { score_day: -1 } }, 
{ $group: { _id: null, data: { $push: "$$ROOT" } } }, 
{ $unwind: { path: "$data", includeArrayIndex: "rank_day" } }, 
{ $addFields: { "data.rank_day": { $add: [ "$rank_day", 1 ] } } }, 
{ $replaceRoot: { newRoot: "$data" } }, 
{ $sort: { score_month: -1 } }, 
{ $group: { _id: null, data: { $push: "$$ROOT" } } }, 
{ $unwind: { path: "$data", includeArrayIndex: "rank_month" } }, 
{ $addFields: { "data.rank_month": { $add: [ "$rank_month", 1 ] } } }, 
{ $replaceRoot: { newRoot: "$data" } }, 
{ $sort: { score_year: -1 } }, 
{ $group: { _id: null, data: { $push: "$$ROOT" } } }, 
{ $unwind: { path: "$data", includeArrayIndex: "rank_year" } }, 
{ $addFields: { "data.rank_year": { $add: [ "$rank_year", 1 ] } } }, 
{ $replaceRoot: { newRoot: "$data" } } 
]

The above pipeline ranks documents by score, but if more than one document has the same score value, those documents are ranked according to their order of occurrence in the pipeline.
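
To make the tie behaviour concrete, here is a small in-memory sketch in plain JavaScript (not MongoDB code). The users array and both function names are made up for illustration. It compares the position-based rank that the sort-and-unwind approach yields with a tie-aware rank, computed as one plus the number of documents with a strictly higher score.

```javascript
// In-memory sketch, not MongoDB code: `users` stands in for the collection.
const users = [
    { _id: "a", score_day: 50 },
    { _id: "b", score_day: 30 },
    { _id: "c", score_day: 30 },
    { _id: "d", score_day: 10 }
];

// Position-based rank: index in the sorted list plus 1, like the
// $sort + $unwind (includeArrayIndex) + $add stages of the pipeline.
function positionRanks(docs, field) {
    const sorted = [...docs].sort((x, y) => y[field] - x[field]);
    const ranks = {};
    sorted.forEach((doc, index) => { ranks[doc._id] = index + 1; });
    return ranks;
}

// Tie-aware rank: 1 + number of documents with a strictly higher score,
// so documents with equal scores share a rank.
function tieAwareRanks(docs, field) {
    const ranks = {};
    for (const doc of docs) {
        ranks[doc._id] = 1 + docs.filter((other) => other[field] > doc[field]).length;
    }
    return ranks;
}

console.log(positionRanks(users, "score_day")); // "b" and "c" get different ranks
console.log(tieAwareRanks(users, "score_day")); // "b" and "c" share a rank
```

In this sketch the tied users keep their input order because JavaScript's sort is stable; the position-based variant still gives them different ranks, which is the limitation described above.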

To mitigate this to some extent, you can add up all the scores to get an overall score and then sort by the day/month/year score together with the overall score, which gives somewhat more relevant ranks. This will not help if more than one document has the same day/month/year score and also the same overall score.
If that trade-off is acceptable, you can use the pipeline below.

[ 
{ $addFields: { overAllScore: { $add: [ "$score_day", "$score_month", "$score_year" ] } } }, 
{ $sort: { score_day: -1, overAllScore: -1 } }, 
{ $group: { _id: null, data: { $push: "$$ROOT" } } }, 
{ $unwind: { path: "$data", includeArrayIndex: "rank_day" } }, 
{ $addFields: { "data.rank_day": { $add: [ "$rank_day", 1 ] } } }, 
{ $replaceRoot: { newRoot: "$data" } }, 
{ $sort: { score_month: -1, overAllScore: -1 } }, 
{ $group: { _id: null, data: { $push: "$$ROOT" } } }, 
{ $unwind: { path: "$data", includeArrayIndex: "rank_month" } }, 
{ $addFields: { "data.rank_month": { $add: [ "$rank_month", 1 ] } } }, 
{ $replaceRoot: { newRoot: "$data" } }, 
{ $sort: { score_year: -1, overAllScore: -1 } }, 
{ $group: { _id: null, data: { $push: "$$ROOT" } } }, 
{ $unwind: { path: "$data", includeArrayIndex: "rank_year" } }, 
{ $addFields: { "data.rank_year": { $add: [ "$rank_year", 1 ] } } }, 
{ $replaceRoot: { newRoot: "$data" } }
]

Finally, you can add a $project stage to exclude the score fields and show only the required data.
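
As a rough sketch of that last step, the stages below could be appended after the ranking stages. The helper name buildFinalStages is hypothetical. The $match on the requested ids is an addition that the answer does not mention; it is placed after the ranking so that ranks are still computed over the whole collection.

```javascript
// Hypothetical helper: stages to append after the ranking stages above.
function buildFinalStages(targetIds) {
    return [
        // keep only the users whose ranks were requested (assumed addition)
        { $match: { _id: { $in: targetIds } } },
        // exclusion-style $project: drop the score fields,
        // keeping _id and the rank_* fields
        { $project: { score_day: 0, score_month: 0, score_year: 0, overAllScore: 0 } }
    ];
}

// Example: print the stages for two user ids.
const finalStages = buildFinalStages([
    "5e0a361d1ca215003e79f388",
    "5e0a361d1ca215003e79f389"
]);
console.log(JSON.stringify(finalStages, null, 2));
```

Excluding overAllScore only matters for the second pipeline; excluding a field that does not exist has no effect.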

Posted by huangapple on 2020-01-06 19:11:45
Source: https://go.coder-hub.com/59611054.html