Two-level nested aggregation in Elasticsearch based on a condition on the first-level aggregation
Question
My ES document structure is like this:
{
"_index": "my_index",
"_type": "_doc",
"_id": "1296",
"_version": 1,
"_seq_no": 431,
"_primary_term": 1,
"_routing": "1296",
"found": true,
"_source": {
"id": 1296,
"test_name": "abc",
"test_id": 513,
"inventory_arr": [
{
"city": "bangalore",
"after_tat": 168,
"before_tat": 54,
"popularity_score": 15,
"rank": 0,
"discounted_price": 710,
"labs": [
{
"lab_id": 395,
"lab_name": "Prednalytics Laboratory",
"lab_rating": 34
},
{
"lab_id": 363,
"lab_name": "Neuberg Diagnostics",
"lab_rating": 408
}
]
},
{
"city": "mumbai",
"after_tat": 168,
"before_tat": 54,
"popularity_score": 15,
"rank": 0,
"discounted_price": 710,
"labs": [
{
"lab_id": 395,
"lab_name": "Prednalytics Laboratory",
"lab_rating": 34
},
{
"lab_id": 380,
"lab_name": "Neuberg Diagnostics",
"lab_rating": 408
}
]
}
]
}
}
I want to know how many tests are performed in each lab that is in Bangalore.
The problem I'm facing is this:
If I group by lab_id using a nested aggregation, it groups by each lab regardless of which city the lab is in.
Suppose there is only one record in my index; then I expect an answer like this for the city Bangalore:
[
{key: 395, doc_count: 1},
{key: 363, doc_count: 1}
]
Note: the same lab_id can appear in more than one city.
Answer 1
Score: 1
This problem can be solved with a filter aggregation.
A nested aggregation iterates over the nested documents. A filter aggregation placed inside it keeps only the nested documents that match the filter query you provide, so its sub-aggregations never see the rest. In your case, filter the inventory_arr entries down to those whose city is bangalore, then run a terms bucket aggregation on lab_id over the entries that remain. Because the city filter is applied before the terms aggregation, a lab_id that also appears under another city is counted only for its Bangalore entries.
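The approach described above could be sketched as the request body below, built here as a Python dict so it can be printed as JSON and sent to the `_search` endpoint of `my_index`. This is only a sketch under assumptions the question does not confirm: that `inventory_arr` is mapped as `nested`, that `labs` is mapped as `nested` inside it, and that `city` can be matched with a `term` query. The function name `build_labs_in_city_query`, the `max_labs` parameter, and the aggregation names (`inventory`, `in_city`, `labs`, `by_lab`) are hypothetical and chosen only for illustration.

```python
import json


def build_labs_in_city_query(city: str, max_labs: int = 100) -> dict:
    """Build a search body that counts lab entries per lab_id within one city.

    Assumes inventory_arr and inventory_arr.labs are both mapped as nested.
    """
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "aggs": {
            "inventory": {
                # Step 1: iterate over the inventory_arr nested documents.
                "nested": {"path": "inventory_arr"},
                "aggs": {
                    "in_city": {
                        # Step 2: keep only the entries for the requested city.
                        "filter": {"term": {"inventory_arr.city": city}},
                        "aggs": {
                            "labs": {
                                # Step 3: descend into the labs of those entries.
                                "nested": {"path": "inventory_arr.labs"},
                                "aggs": {
                                    "by_lab": {
                                        # Step 4: one bucket per lab_id.
                                        "terms": {
                                            "field": "inventory_arr.labs.lab_id",
                                            "size": max_labs,
                                        }
                                    }
                                },
                            }
                        },
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_labs_in_city_query("bangalore"), indent=2))
```

If `labs` turns out to be a plain object field rather than `nested`, the inner `nested` step would probably need to be dropped and the `terms` aggregation placed directly under the filter. Note also that the bucket `doc_count` here counts nested lab entries, not top-level test documents; if the two can differ in your data, a `reverse_nested` sub-aggregation under `by_lab` may be worth trying.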
Good luck!