OpenSearch Weird Search Behaviour

Question

I ran into some weird search behaviour while exploring OpenSearch. All the records in my index are as follows:

[{
        "_index": "table1",
        "_id": "AO4AnIYBC-oD5gl3Hm7W",
        "_score": 1,
        "_source": {
            "period": "JAN-23",
            "requestID": "10273376",
            "header": {
                "period": "JAN-23"
            },
            "status": "Complete"
        }
    },
    {
        "_index": "table1",
        "_id": "gavmkoYB7MbgbX172uOM",
        "_score": 1,
        "_source": {
            "period": "JAN-23",
            "requestID": "100138128",
            "header": {
                "period": "JAN-23"
            },
            "status": "Complete"
        }
    },
    {
        "_index": "table1",
        "_id": "g6vnkoYB7MbgbX17POOY",
        "_score": 1,
        "_source": {
            "period": "FEB-23",
            "requestID": "10246457",
            "header": {
                "period": "FEB-23"
            },
            "status": "Complete"
        }
    },
    {
        "_index": "table1",
        "_id": "hKvnkoYB7MbgbX17XeOw",
        "_score": 1,
        "_source": {
            "period": "JAN-23",
            "requestID": "10273941",
            "header": {
                "period": "JAN-23"
            },
            "status": "Complete"
        }
    },
    {
        "_index": "table1",
        "_id": "_-7nkoYBC-oD5gl3TW1Z",
        "_score": 1,
        "_source": {
            "period": "FEB-23",
            "requestID": "10254951",
            "header": {
                "period": "FEB-23"
            },
            "status": "Complete"
        }
    },
    {
        "_index": "table1",
        "_id": "gqvnkoYB7MbgbX17JONH",
        "_score": 1,
        "_source": {
            "period": "JAN-23",
            "requestID": "10273376",
            "header": {
                "period": "JAN-23"
            },
            "status": "Complete"
        }
    }
]


Here are some of the results I get when querying this data:

Query 1: Returns Correct Info

GET /table1/_search
{
  "query": {
    "match": {
      "status": "Complete"
    }
  }
}

Query 2: Returns all records, which is wrong; ideally it should return only 4 records

GET /table1/_search
{
  "query": {
    "match": {
      "period": "JAN-23"
    }
  }
}

Query 3: Returns the 4 records with period JAN-23, which is again wrong, as it should return 0 records

GET /table1/_search
{
  "query": {
    "match": {
      "period": "JAN-22"
    }
  }
}

Query 4: Returns all records, which is again wrong, as it should return 0

GET /table1/_search
{
  "query": {
    "match": {
      "period": "DEC-23"
    }
  }
}

It would be really helpful if anyone could help me understand why this happens.

Thanks

Answer 1

Score: 1

The period field in the table1 index is mapped as text, so its value is analyzed and split into multiple tokens: JAN-23 becomes jan and 23. A match query on "period": "JAN-23" therefore returns every document whose period field contains jan or 23. That also explains the other results: JAN-22 matches the four documents containing jan, and DEC-23 matches every document because they all contain 23. To match the value exactly, run a term query against the period.keyword sub-field.
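
To make the token matching concrete, here is a minimal Python sketch that imitates it outside of OpenSearch. It is not OpenSearch code: the `analyze` and `match_or` helpers are hypothetical names, and the regex is only a rough stand-in for the default analyzer that happens to behave the same way for simple values such as JAN-23. The period values are the six from the question.

```python
import re

# Period values of the six documents listed in the question.
PERIODS = ["JAN-23", "JAN-23", "FEB-23", "JAN-23", "FEB-23", "JAN-23"]


def analyze(text):
    """Rough stand-in for the default analyzer (an assumption, not the real one):
    split on anything that is not a letter or digit, then lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def match_or(query, values):
    """Return the values that share at least one token with the query,
    imitating a match query whose terms are combined with OR."""
    query_tokens = set(analyze(query))
    return [value for value in values if query_tokens & set(analyze(value))]


# Replay Queries 2 to 4 from the question and print the number of hits.
for query in ("JAN-23", "JAN-22", "DEC-23"):
    hits = match_or(query, PERIODS)
    print(query, analyze(query), len(hits))
```

Under these assumptions the sketch reports 6, 4 and 6 hits, the same counts the question describes for Queries 2, 3 and 4.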

Mapping

"period": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}

Term query that returned accurate results

GET /table1/_search
{
   "query": {
    "term": {
      "period.keyword": {
        "value": "DEC-22"
      }
    }
   }
}
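
For contrast, the exact comparison performed against the keyword sub-field can be sketched in Python in the same spirit. This is an illustration only: `term_query` is a hypothetical helper, and it assumes the keyword sub-field has no normalizer (the mapping above does not define one), so the stored value is compared as-is, including case.

```python
# Period values of the six documents listed in the question.
PERIODS = ["JAN-23", "JAN-23", "FEB-23", "JAN-23", "FEB-23", "JAN-23"]


def term_query(value, values):
    """Imitate a term query on an un-normalized keyword field:
    the whole stored value must equal the query value exactly."""
    return [stored for stored in values if stored == value]


print(len(term_query("JAN-23", PERIODS)))  # exact value present in four documents
print(len(term_query("DEC-22", PERIODS)))  # value absent from the index
print(len(term_query("jan-23", PERIODS)))  # same letters, different case
```

With the data from the question, only the first lookup finds documents (four of them). The other two find none, because nothing is tokenized or lowercased before the comparison.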

huangapple
  • Published on March 7, 2023 at 11:48:31
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75657891.html