Return nested fields in an Elasticsearch query

Question

I'm very new to Elasticsearch.
I have a data set that includes nested objects. When I do a keyword search in a web app, a query with a bool should array is built. By default, the field values in the nested objects are not returned in the response for this query, and these fields are what I need access to. I can get them returned if I use a nested query, but no matter what I do, I seem to influence the results of the query. For example, if I add a nested clause to the bool should query, it returns ALL possible results regardless of the search keyword, because every possible result has SOME value in this nested field. What's the best way to get nested fields in an Elasticsearch response without influencing which documents are returned?

Example query:

{
  "stored_fields": [
    "document_code",
    "filename"
  ],
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "filename": "some filename"
          }
        }
      ]
    }
  }
}

This query returns ONLY the document that matches (the one with that exact document code), as it should, but the response does not include any of the nested fields that might be associated with the document.
For example:

"total": {
  "value": 1
},
"hits": [
  {
    "fields": {
      "document_code": [
        "123456"
      ],
      "filename": [
        "some filename"
      ]
    }
  }
]

I need the nested fields in the response as they would appear if I used this query:

"stored_fields": [
  "document_code",
  "filename"
],
"query": {
  "bool": {
    "should": [
      {
        "match": {
          "filename": "some filename"
        }
      },
      {
        "nested": {
          "path": "nested_fields",
          "query": {
            "match_all": {}
          },
          "inner_hits": {
            "stored_fields": [
              "nested_fields.field_one",
              "nested_fields.field_two"
            ]
          }
        }
      }
    ]
  }
}

This gives me a response that looks like this:

"total": {
  "value": 10000
},
"hits": [
  {
    "fields": {
      "document_code": [
        "123456"
      ],
      "filename": [
        "some filename"
      ]
    },
    "inner_hits": {
      "nested_fields": {
        "hits": {
          "total": 1,
          "hits": [
            {
              "fields": {
                "nested_fields.field_one": [ "nested field one value" ],
                "nested_fields.field_two": [ "nested field two value" ]
              }
            }
          ]
        }
      }
    }
  },
  {
    .....and 10,000 additional hits that don't match the document code
  }
]

It's clear to me why I'm not getting the results I need. I'm basically saying "give me results that match this document code OR have ANY nested field." Since every document has some value in this nested field, the query returns every single document. I just don't know how to get the response to contain the inner_hits values without matching on something. I've tried replacing "match_all" in the nested query with "match_none"; that returns the correct results (the one and only document with that document code) along with the nested_fields inner_hits, but the inner_hits are empty and I need the values.
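
To make the OR behaviour described above concrete, here is a toy in-memory model in Python. It is not Elasticsearch and does not use its client; the three sample documents, the helper names, and the exact-string comparison standing in for "match" are all assumptions made purely for illustration.

```python
# Toy model of bool/should semantics. This is NOT Elasticsearch; the sample
# documents and helper functions below are hypothetical illustrations.
docs = [
    {"document_code": "123456", "filename": "some filename",
     "nested_fields": [{"field_one": "a", "field_two": "b"}]},
    {"document_code": "654321", "filename": "other report",
     "nested_fields": [{"field_one": "c", "field_two": "d"}]},
    {"document_code": "777777", "filename": "third file",
     "nested_fields": [{"field_one": "e", "field_two": "f"}]},
]


def match_filename(doc, text):
    # Simplification: an exact comparison stands in for a "match" query.
    return doc["filename"] == text


def nested_match_all(doc):
    # A nested match_all clause matches any document with at least one nested object.
    return len(doc["nested_fields"]) > 0


def bool_should(doc, clauses):
    # With only "should" clauses, a document matches if ANY clause matches (logical OR).
    return any(clause(doc) for clause in clauses)


def keyword_clause(doc):
    return match_filename(doc, "some filename")


keyword_only = [d for d in docs if bool_should(d, [keyword_clause])]
with_nested = [d for d in docs if bool_should(d, [keyword_clause, nested_match_all])]

print(len(keyword_only), len(with_nested))
```

In this model the keyword-only query selects one document, while adding the nested match_all clause selects all of them, which mirrors the jump in "total" seen in the two responses above.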

Also note that in my actual application, there would be a bunch of other wildcards and matches to search for some keyword throughout different parts of the document.

In short, I'm not actually trying to filter the results based on the nested fields at all - I just need them in the response.

Answer 1

Score: 0

I assume the document has a nested field named "nested_fields".

In that case, since you only need the nested field returned alongside the other fields, change this:

"stored_fields": [
  "document_code",
  "filename"
],

to this:

"_source": {
  "includes": [
    "document_code",
    "filename",
    "nested_fields"
  ]
},
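
As a rough sketch of how the suggested change could look when the request body is built in application code, the Python snippet below assembles the search body as a plain dictionary. The function name build_search_body and its parameters are hypothetical, and the sketch assumes that _source is enabled on the index, which source filtering relies on. No Elasticsearch client is used; the body is only constructed and printed.

```python
import json


def build_search_body(keyword, nested_path="nested_fields"):
    """Build a search body that returns nested fields via _source filtering.

    Hypothetical helper for illustration: the nested path is only listed in
    "_source.includes", so it never becomes a query clause and cannot change
    which documents match.
    """
    return {
        "_source": {
            "includes": ["document_code", "filename", nested_path],
        },
        "query": {
            "bool": {
                "should": [
                    {"match": {"filename": keyword}},
                ]
            }
        },
    }


body = build_search_body("some filename")
print(json.dumps(body, indent=2))
```

Because the nested field appears only in the _source filter, the query section is identical to the original keyword-only query, so additional wildcard or match clauses can be appended to the should array without the nested data affecting the hit count.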

Posted by huangapple on 2023-07-28 05:50:07
Source: https://go.coder-hub.com/76783607.html