2023-02-16 16:30:08, go

Two-level nested aggregation in Elasticsearch based on a condition over the first-level aggregation

Question

My Elasticsearch document is structured like this:

{
    "_index": "my_index",
    "_type": "_doc",
    "_id": "1296",
    "_version": 1,
    "_seq_no": 431,
    "_primary_term": 1,
    "_routing": "1296",
    "found": true,
    "_source": {
        "id": 1296,
        "test_name": "abc",
        "test_id": 513,
        "inventory_arr": [
            {
                "city": "bangalore",
                "after_tat": 168,
                "before_tat": 54,
                "popularity_score": 15,
                "rank": 0,
                "discounted_price": 710,
                "labs": [
                    {
                        "lab_id": 395,
                        "lab_name": "Prednalytics Laboratory",
                        "lab_rating": 34
                    },
                    {
                        "lab_id": 363,
                        "lab_name": "Neuberg Diagnostics",
                        "lab_rating": 408
                    }
                ]
            },
            {
                "city": "mumbai",
                "after_tat": 168,
                "before_tat": 54,
                "popularity_score": 15,
                "rank": 0,
                "discounted_price": 710,
                "labs": [
                    {
                        "lab_id": 395,
                        "lab_name": "Prednalytics Laboratory",
                        "lab_rating": 34
                    },
                    {
                        "lab_id": 380,
                        "lab_name": "Neuberg Diagnostics",
                        "lab_rating": 408
                    }
                ]
            }
        ]
    }
}

I want to know how many tests are performed in each lab located in Bangalore.
The problem I'm facing is this:
If I group by lab_id using a nested aggregation, it groups by every lab regardless of which city it is in.

Suppose my index contains only this one document. Then, for the city Bangalore, I expect a result like this:

[
    {key: 395, doc_count: 1},
    {key: 363, doc_count: 1}
]

Note: the same lab_id can appear under more than one city.

Answer 1

Score: 1

This problem can be solved using a filter aggregation.

A nested aggregation iterates over the nested documents. A filter aggregation placed inside it keeps only the nested documents that match the filter query you provide. In your case, you want to keep only the nested documents whose city is Bangalore. Once the others are excluded, you can add a terms bucket aggregation on lab_id beneath the filter.
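As a rough illustration of that nesting order, here is a minimal Python sketch that builds the search body as a plain dictionary. It rests on several assumptions not stated in the question: that both inventory_arr and inventory_arr.labs are mapped as nested, that city is an exact-match (keyword) field, and that a second nested step is needed to reach labs. The names ASSUMED_MAPPING, build_labs_in_city_query, and the aggregation labels (inventory, in_city, labs, by_lab, tests) are hypothetical.

```python
import json

# Assumed mapping (not given in the question): both array levels are
# "nested", and "city" is a keyword so a term filter matches it exactly.
ASSUMED_MAPPING = {
    "mappings": {
        "properties": {
            "inventory_arr": {
                "type": "nested",
                "properties": {
                    "city": {"type": "keyword"},
                    "labs": {
                        "type": "nested",
                        "properties": {"lab_id": {"type": "integer"}},
                    },
                },
            }
        }
    }
}


def build_labs_in_city_query(city):
    """Build a search body: nested -> filter by city -> nested labs -> terms."""
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "aggs": {
            "inventory": {
                # Step 1: iterate over the inventory_arr nested documents.
                "nested": {"path": "inventory_arr"},
                "aggs": {
                    "in_city": {
                        # Step 2: keep only the entries for the given city.
                        "filter": {"term": {"inventory_arr.city": city}},
                        "aggs": {
                            "labs": {
                                # Step 3: descend into the labs of those entries.
                                "nested": {"path": "inventory_arr.labs"},
                                "aggs": {
                                    "by_lab": {
                                        # Step 4: one bucket per lab_id.
                                        "terms": {
                                            "field": "inventory_arr.labs.lab_id"
                                        },
                                        # Join back to the root documents so
                                        # tests are counted, not lab entries.
                                        "aggs": {"tests": {"reverse_nested": {}}},
                                    }
                                },
                            }
                        },
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_labs_in_city_query("bangalore"), indent=2))
```

With this shape, the buckets would be returned under aggregations.inventory.in_city.labs.by_lab.buckets. Each bucket's own doc_count counts nested lab entries, while the tests sub-aggregation counts the root documents they belong to, which is the closer match for "how many tests" if a lab can appear more than once within a document.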

Good luck!
