Elasticsearch: match_phrase_prefix can't find items when search query ends with '0'

Question

My Elasticsearch index contains this entry: AB-001-123-B

When running a query with match_phrase_prefix like this:

"query": {
        "bool": {
            "should": [
                {
                    "match_phrase_prefix": {
                        "code": "AB-00"
                    }
                }
            ]
        }
    },
...

I get no results.
When I change the code to

"query": {
        "bool": {
            "should": [
                {
                    "match_phrase_prefix": {
                        "code": "AB-001"
                    }
                }
            ]
        }
    },
...

It returns the entry. When I remove -00 from the code, it returns the entry as well.

I ran several tests with other entries. It seems the query fails whenever the search phrase ends with 0.

Why is that? Is there a way to fix the query? I tried escaping, without any effect.

Answer 1

Score: 1

The code field is a text field analyzed by the standard analyzer. This means that AB-001-123-B is analyzed into the following tokens:

GET _analyze
{
  "analyzer": "standard",
  "text": "AB-001-123-B"
}

Response =>
{
  "tokens" : [
    {
      "token" : "ab",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "001",
      "start_offset" : 3,
      "end_offset" : 6,
      "type" : "<NUM>",
      "position" : 1
    },
    {
      "token" : "123",
      "start_offset" : 7,
      "end_offset" : 10,
      "type" : "<NUM>",
      "position" : 2
    },
    {
      "token" : "b",
      "start_offset" : 11,
      "end_offset" : 12,
      "type" : "<ALPHANUM>",
      "position" : 3
    }
  ]
}
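
To see how the search input itself is tokenized, the same API can be pointed at the field instead of a named analyzer, so the analyzer actually configured for code is used. A minimal sketch: the body below would be sent to GET my-index/_analyze, where my-index is a hypothetical index name standing in for the real one.

```json
{
  "field": "code",
  "text": "AB-00"
}
```

Comparing this output with the tokens above shows which terms match_phrase_prefix is really trying to match.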

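match_phrase_prefix is not ideal for this use case, because it matches against the analyzed tokens rather than the original string. It would be better to run a prefix query against the code.keys field (presumably a keyword sub-field that holds the unanalyzed value), like this: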

"query": {
        "bool": {
            "should": [
                {
                    "prefix": {
                        "code.keys": "AB-001"
                    }
                }
            ]
        }
    },
...
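
The query above only works if code.keys exists in the mapping. The question does not show the mapping, so the following is an assumption about what it might look like: a text field code with a keyword sub-field named keys. The body would be sent when creating the index (for example PUT my-index, with my-index again a hypothetical name).

```json
{
  "mappings": {
    "properties": {
      "code": {
        "type": "text",
        "fields": {
          "keys": {
            "type": "keyword"
          }
        }
      }
    }
  }
}
```

With such a mapping, AB-001-123-B is indexed in code.keys as one unsplit term, so a prefix query for AB-00 would be expected to match it as well. Note that prefix queries on keyword fields are case-sensitive by default.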

Posted by huangapple on 2023-07-18 16:02:55
When reposting, please keep the link to this article: https://go.coder-hub.com/76710660.html