How to sort a composite aggregation on the basis of a sub-aggregation? Below is the query.

Question

GET myIndex/_search
{
  "from": 0,
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "user_id": {
              "value": "a88604b0",
              "boost": 1
            }
          }
        },
        {
          "term": {
            "entity_status.keyword": {
              "value": "ACTIVE",
              "boost": 1
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  },
  "sort": [
    {
      "txn_date": {
        "order": "desc"
      }
    }
  ], 
  "aggs": {
    "my_buckets": {
      "composite": {
        "sources": [
          {
            "group_by": {
              "terms": {
                "field": "category"
              }
            }
          }
        ]
      },
      "aggs": {
        "total_amount": {
          "sum": {
            "field": "amount"
          }
        }
      }
    }
  }
}

I am executing the above query, but I want the buckets to be sorted by the total_amount
sub-aggregation in descending order. Is there a modification or another way to achieve this?

Here is the result of the above query.

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 4,
    "successful" : 4,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 22,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "my_buckets" : {
      "after_key" : {
        "group_by" : "Travel"
      },
      "buckets" : [
        {
          "key" : {
            "group_by" : "Bills"
          },
          "doc_count" : 2,
          "total_amount" : {
            "value" : 86710.44
          }
        },
        {
          "key" : {
            "group_by" : "Grocery"
          },
          "doc_count" : 1,
          "total_amount" : {
            "value" : 43355.22
          }
        },
        {
          "key" : {
            "group_by" : "Fashion"
          },
          "doc_count" : 5,
          "total_amount" : {
            "value" : 216776.1
          }
        },
        {
          "key" : {
            "group_by" : "Recharge"
          },
          "doc_count" : 7,
          "total_amount" : {
            "value" : 303486.54
          }
        },
        {
          "key" : {
            "group_by" : "Shopping"
          },
          "doc_count" : 2,
          "total_amount" : {
            "value" : 86710.44
          }
        },
        {
          "key" : {
            "group_by" : "Travel"
          },
          "doc_count" : 5,
          "total_amount" : {
            "value" : 216776.1
          }
        }
      ]
    }
  }
}

I want the buckets to come back sorted by total_amount.

Answer 1

Score: 3

Unfortunately, this is not possible right now. Each source of a composite aggregation can be ordered by its own key in ascending or descending order, but that's pretty much it.

Ordering by a sub-aggregation would require gathering all composite keys and computing the total amount for each bucket, which would be very costly in terms of memory. That is exactly the opposite of what the composite aggregation is trying to achieve, i.e. a way to paginate through buckets with a very low memory footprint.
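If the composite aggregation has to stay, one possible workaround is to do the sort on the client: page through every bucket using the `after_key` returned with each page, then sort the collected list. Below is a minimal sketch in Python. It assumes a hypothetical `search` callable that sends a request body to Elasticsearch and returns the parsed JSON response (it is not part of any client library), and it reuses the aggregation names from the question. Note that this holds all buckets in client memory, which is the very cost the composite aggregation avoids on the server.

```python
import copy
from typing import Any, Callable, Dict, List, Optional


def collect_sorted_buckets(
    search: Callable[[Dict[str, Any]], Dict[str, Any]],
    base_body: Dict[str, Any],
    agg_name: str = "my_buckets",
    metric_name: str = "total_amount",
) -> List[Dict[str, Any]]:
    """Page through a composite aggregation, then sort all buckets by a metric.

    `search` is a hypothetical callable (an assumption of this sketch): it
    takes a request body and returns the parsed Elasticsearch response.
    """
    buckets: List[Dict[str, Any]] = []
    after_key: Optional[Dict[str, Any]] = None

    while True:
        # Copy the body so the caller's dict is never mutated.
        body = copy.deepcopy(base_body)
        if after_key is not None:
            # "after" tells the composite aggregation where to resume.
            body["aggs"][agg_name]["composite"]["after"] = after_key

        result = search(body)["aggregations"][agg_name]
        page = result.get("buckets", [])
        if not page:
            break  # an empty page means every bucket has been read

        buckets.extend(page)
        after_key = result.get("after_key")
        if after_key is None:
            break

    # Sort on the client, largest metric value first; ties keep page order.
    return sorted(buckets, key=lambda b: b[metric_name]["value"], reverse=True)
```

The function only needs the request body from the question (with whatever composite `size` is convenient per page) and a way to execute it, so it can be wired to any HTTP or Elasticsearch client.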

Also note that if your categories have a low cardinality (<1000), you don't really need the composite aggregation; you can achieve what you need with the terms aggregation, like this:

{
  ...
  "aggs": {
    "group_by": {
      "terms": {
        "field": "category",
        "size": 100,
        "order": {
          "total_amount": "desc"
        }
      },
      "aggs": {
        "total_amount": {
          "sum": {
            "field": "amount"
          }
        }
      }
    }
  }
}
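For reference, here is a sketch of how the complete request body might be assembled in Python. It assumes that the `...` above stands for the `query` clause from the question; the helper name `build_terms_request` and its parameters are hypothetical. The top-level `sort` from the question is left out because `"size": 0` returns no hits to sort.

```python
import json
from typing import Any, Dict


def build_terms_request(user_id: str, status: str = "ACTIVE", size: int = 100) -> Dict[str, Any]:
    """Build a search body that groups by category, ordered by summed amount.

    Hypothetical helper: the filters mirror the question's query, and the
    aggregation mirrors the terms aggregation shown above.
    """
    return {
        "size": 0,  # only the aggregation results are needed
        "query": {
            "bool": {
                "must": [
                    {"term": {"user_id": {"value": user_id}}},
                    {"term": {"entity_status.keyword": {"value": status}}},
                ]
            }
        },
        "aggs": {
            "group_by": {
                "terms": {
                    "field": "category",
                    "size": size,
                    # Order the buckets by the sub-aggregation defined below.
                    "order": {"total_amount": "desc"},
                },
                "aggs": {"total_amount": {"sum": {"field": "amount"}}},
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON that would be sent to the _search endpoint.
    print(json.dumps(build_terms_request("a88604b0"), indent=2))
```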

huangapple
  • Posted on May 29, 2020, 18:57:03
  • When reposting, please keep the link to this article: https://go.coder-hub.com/62084334.html