How to sort a composite aggregation by a sub-aggregation?
Question
Below is the query:
GET myIndex/_search
{
  "from": 0,
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "user_id": {
              "value": "a88604b0",
              "boost": 1
            }
          }
        },
        {
          "term": {
            "entity_status.keyword": {
              "value": "ACTIVE",
              "boost": 1
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  },
  "sort": [
    {
      "txn_date": {
        "order": "desc"
      }
    }
  ],
  "aggs": {
    "my_buckets": {
      "composite": {
        "sources": [
          {
            "group_by": {
              "terms": {
                "field": "category"
              }
            }
          }
        ]
      },
      "aggs": {
        "total_amount": {
          "sum": {
            "field": "amount"
          }
        }
      }
    }
  }
}
I am executing the query above, but I want the buckets to be sorted by the sub-aggregation total_amount in descending order. Is there a modification or another way to achieve this?
Here is the result of the above query.
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 4,
    "successful" : 4,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 22,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "my_buckets" : {
      "after_key" : {
        "group_by" : "Travel"
      },
      "buckets" : [
        {
          "key" : {
            "group_by" : "Bills"
          },
          "doc_count" : 2,
          "total_amount" : {
            "value" : 86710.44
          }
        },
        {
          "key" : {
            "group_by" : "Grocery"
          },
          "doc_count" : 1,
          "total_amount" : {
            "value" : 43355.22
          }
        },
        {
          "key" : {
            "group_by" : "Fashion"
          },
          "doc_count" : 5,
          "total_amount" : {
            "value" : 216776.1
          }
        },
        {
          "key" : {
            "group_by" : "Recharge"
          },
          "doc_count" : 7,
          "total_amount" : {
            "value" : 303486.54
          }
        },
        {
          "key" : {
            "group_by" : "Shopping"
          },
          "doc_count" : 2,
          "total_amount" : {
            "value" : 86710.44
          }
        },
        {
          "key" : {
            "group_by" : "Travel"
          },
          "doc_count" : 5,
          "total_amount" : {
            "value" : 216776.1
          }
        }
      ]
    }
  }
}
I want the buckets to be returned sorted by total_amount.
Answer 1
Score: 3
Unfortunately, this is not possible right now. Each source of a composite aggregation can be ordered by its own values, in ascending or descending order, but that's pretty much it.
Ordering by a sub-aggregation would require gathering all composite keys and computing the total amount for every bucket, which would be very costly in terms of memory. That is exactly the opposite of what the composite aggregation is trying to achieve, i.e. a way to paginate through buckets with a very low memory footprint.
Also note that if your categories have a low cardinality (<1000), you don't really need the composite aggregation. You can achieve what you need with the terms aggregation, like this:
{
  ...
  "aggs": {
    "group_by": {
      "terms": {
        "field": "category",
        "size": 100,
        "order": {
          "total_amount": "desc"
        }
      },
      "aggs": {
        "total_amount": {
          "sum": {
            "field": "amount"
          }
        }
      }
    }
  }
}
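If the composite aggregation has to stay (for example because the real query groups on more fields), one possible workaround is to page through all buckets with after_key and sort them on the client. This is not part of the answer above, only a minimal Python sketch of that idea. It gives up the low memory footprint described earlier, so it is only reasonable when the full set of buckets fits comfortably in client memory. The `search` callable and both function names are hypothetical: `search` stands for whatever wrapper you have that sends a request body to Elasticsearch and returns the parsed JSON response.

```python
import copy
from typing import Any, Callable, Dict, List


def collect_composite_buckets(
    search: Callable[[Dict[str, Any]], Dict[str, Any]],
    body: Dict[str, Any],
    agg_name: str = "my_buckets",
) -> List[Dict[str, Any]]:
    """Page through a composite aggregation and return every bucket.

    `search` is a hypothetical callable that sends `body` to Elasticsearch
    and returns the response as a dict.
    """
    buckets: List[Dict[str, Any]] = []
    after = None
    while True:
        # Work on a copy so the caller's request body is never mutated.
        request = copy.deepcopy(body)
        if after is not None:
            # Resume right after the last key of the previous page.
            request["aggs"][agg_name]["composite"]["after"] = after
        page = search(request)["aggregations"][agg_name]
        buckets.extend(page["buckets"])
        after = page.get("after_key")
        # An empty page or a missing after_key means nothing is left.
        if not page["buckets"] or after is None:
            break
    return buckets


def sort_by_total_amount(buckets: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Sort buckets by the total_amount sub-aggregation, largest first."""
    return sorted(buckets, key=lambda b: b["total_amount"]["value"], reverse=True)
```

Calling sort_by_total_amount(collect_composite_buckets(search, body)) with the question's request body would return the category buckets ordered by total_amount in descending order. Buckets with equal totals keep the order in which they were collected.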