OpenSearch Weird Search Behaviour
Question
I encountered some weird search behaviour while exploring OpenSearch's search engine. All the records in my index are as follows:
[
  {
    "_index": "table1",
    "_id": "AO4AnIYBC-oD5gl3Hm7W",
    "_score": 1,
    "_source": {
      "period": "JAN-23",
      "requestID": "10273376",
      "header": {
        "period": "JAN-23"
      },
      "status": "Complete"
    }
  },
  {
    "_index": "table1",
    "_id": "gavmkoYB7MbgbX172uOM",
    "_score": 1,
    "_source": {
      "period": "JAN-23",
      "requestID": "100138128",
      "header": {
        "period": "JAN-23"
      },
      "status": "Complete"
    }
  },
  {
    "_index": "table1",
    "_id": "g6vnkoYB7MbgbX17POOY",
    "_score": 1,
    "_source": {
      "period": "FEB-23",
      "requestID": "10246457",
      "header": {
        "period": "FEB-23"
      },
      "status": "Complete"
    }
  },
  {
    "_index": "table1",
    "_id": "hKvnkoYB7MbgbX17XeOw",
    "_score": 1,
    "_source": {
      "period": "JAN-23",
      "requestID": "10273941",
      "header": {
        "period": "JAN-23"
      },
      "status": "Complete"
    }
  },
  {
    "_index": "table1",
    "_id": "_-7nkoYBC-oD5gl3TW1Z",
    "_score": 1,
    "_source": {
      "period": "FEB-23",
      "requestID": "10254951",
      "header": {
        "period": "FEB-23"
      },
      "status": "Complete"
    }
  },
  {
    "_index": "table1",
    "_id": "gqvnkoYB7MbgbX17JONH",
    "_score": 1,
    "_source": {
      "period": "JAN-23",
      "requestID": "10273376",
      "header": {
        "period": "JAN-23"
      },
      "status": "Complete"
    }
  }
]
Here are some of the results I get when querying this data:
Query 1: Returns Correct Info
GET /table1/_search
{
  "query": {
    "match": {
      "status": "Complete"
    }
  }
}
Query 2: Returns all records, which is wrong; ideally it should return only 4 records
GET /table1/_search
{
  "query": {
    "match": {
      "period": "JAN-23"
    }
  }
}
Query 3: Returns the 4 records with period JAN-23, which is again wrong, as it should return 0 records
GET /table1/_search
{
  "query": {
    "match": {
      "period": "JAN-22"
    }
  }
}
Query 4: Returns all records, which is again wrong, as it should return 0
GET /table1/_search
{
  "query": {
    "match": {
      "period": "DEC-23"
    }
  }
}
It would be really helpful if anyone could help me understand why this happens.
Thanks
Answer 1
Score: 1
The field period in the index table1 is of type text, so its value is analysed and split into multiple tokens: JAN-23 becomes jan and 23. The text of a match query is analysed the same way, and by default a document matches if it contains any of the resulting tokens. So a query for "period": "JAN-23" returns every document whose period field contains jan or 23, which here is all of them. To match the value exactly, use a term query against the period.keyword subfield, which stores the value unanalysed.
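The behaviour described above can be reproduced without a cluster. The following is only a rough Python sketch, not OpenSearch code: `analyze` is a hypothetical stand-in that approximates the default analyzer by splitting on non-alphanumeric characters and lowercasing, and `match_or` is a hypothetical helper that mimics a match query with its default OR operator. The list `PERIODS` holds the period values of the six documents from the question.

```python
import re

# Period values of the six documents shown in the question.
PERIODS = ["JAN-23", "JAN-23", "FEB-23", "JAN-23", "FEB-23", "JAN-23"]


def analyze(text):
    """Approximation of the default analyzer (an assumption, not the real
    implementation): split on non-alphanumeric characters and lowercase."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]


def match_or(query, values):
    """Mimic a match query with the default OR operator:
    a value is a hit if it shares at least one token with the query."""
    wanted = set(analyze(query))
    return [v for v in values if wanted & set(analyze(v))]


print(analyze("JAN-23"))  # ['jan', '23']
for q in ("JAN-23", "JAN-22", "DEC-23"):
    # Prints 6, 4 and 6 hits, the same counts reported in the question.
    print(q, len(match_or(q, PERIODS)))
```

Every document contains the token 23, so any query that includes 23 matches all six records, while JAN-22 matches only the four records that share the token jan.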
Mapping
"period": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}
Term query that returned accurate results
GET /table1/_search
{
  "query": {
    "term": {
      "period.keyword": {
        "value": "DEC-22"
      }
    }
  }
}
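For comparison, here is a minimal Python sketch of what a term query on the keyword subfield amounts to, assuming no normalizer is configured (none appears in the mapping shown). The helper `term_keyword` is hypothetical and only illustrates the semantics: the stored value is compared as a whole, and the comparison is case-sensitive.

```python
# Period values of the six documents shown in the question.
PERIODS = ["JAN-23", "JAN-23", "FEB-23", "JAN-23", "FEB-23", "JAN-23"]


def term_keyword(value, values):
    """Mimic a term query on a keyword field (assumed to have no normalizer):
    the whole stored value must equal the query value, including case."""
    return [v for v in values if v == value]


for q in ("JAN-23", "JAN-22", "DEC-23", "jan-23"):
    # Prints 4, 0, 0 and 0 hits respectively.
    print(q, len(term_keyword(q, PERIODS)))
```

Because nothing is tokenized or lowercased, the query value has to be written exactly as it was indexed; jan-23 in lowercase finds nothing.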