Substring match in Elasticsearch with conditions
Question
I am trying to write an Elasticsearch query that returns all restaurants whose name contains the substring 'pizz' but contains neither 'pizza' nor 'pizzeria'.
The query I wrote for this purpose is:
GET my_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "wildcard": {
            "RestaurantName": {
              "value": "*pizz*"
            }
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "RestaurantName": "pizza"
          }
        },
        {
          "match": {
            "RestaurantName": "pizzeria"
          }
        }
      ]
    }
  }
}
This query matches names like Instapizza, which is wrong. It should match combined or capitalized forms such as Fozzie's Pizzaiolo, PizzaVito, and Pizzalicious. How can I change the query so that it no longer matches the unwanted names? Any help would be appreciated.
Answer 1
Score: 2
When you index RestaurantName as a text field, the standard analyzer is applied, and its lowercase token filter lowercases every token stored in Lucene. A wildcard query on that field is therefore matched against lowercase tokens and cannot tell Pizz from pizz.
First, add a keyword sub-field to the RestaurantName field:
{
  "mappings": {
    "properties": {
      "RestaurantName": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
Then run the wildcard query against RestaurantName.keyword, which is not analyzed and so is case-sensitive:
{
  "query": {
    "bool": {
      "must": [
        {
          "wildcard": {
            "RestaurantName.keyword": {
              "value": "*Pizz*"
            }
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "RestaurantName": "pizza"
          }
        },
        {
          "match": {
            "RestaurantName": "pizzeria"
          }
        }
      ]
    }
  }
}
The result is:
{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": 1.0,
    "hits": [
      {
        "_index": "pizza",
        "_type": "_doc",
        "_id": "1L6ob4cB6Rdc8HbDY8vi",
        "_score": 1.0,
        "_source": {
          "RestaurantName": "Fozzie's Pizzaiolo"
        }
      },
      {
        "_index": "pizza",
        "_type": "_doc",
        "_id": "1b6ob4cB6Rdc8HbDg8tA",
        "_score": 1.0,
        "_source": {
          "RestaurantName": "PizzaVito"
        }
      },
      {
        "_index": "pizza",
        "_type": "_doc",
        "_id": "1r6ob4cB6Rdc8HbDmMuJ",
        "_score": 1.0,
        "_source": {
          "RestaurantName": "Pizzalicious"
        }
      }
    ]
  }
}
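To see why the text-field query lets Instapizza through while the keyword query does not, here is a small Python sketch that models the two queries without touching Elasticsearch. It is only an approximation: the analyze helper is a rough stand-in for the standard analyzer (lowercase, then split into word tokens), the function names are made up for illustration, and fnmatchcase is used as a simple case-sensitive wildcard matcher.

```python
import re
from fnmatch import fnmatchcase


def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, then split into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def excluded_by_match(name):
    # A match query on the text field compares whole analyzed tokens,
    # so "instapizza" is not the same token as "pizza".
    tokens = analyze(name)
    return "pizza" in tokens or "pizzeria" in tokens


def text_field_query(name):
    # Models the original query: the wildcard is tested against lowercase tokens.
    hit = any(fnmatchcase(token, "*pizz*") for token in analyze(name))
    return hit and not excluded_by_match(name)


def keyword_field_query(name):
    # Models the answer's query: the wildcard is tested against the
    # whole, unanalyzed string, so letter case is preserved.
    hit = fnmatchcase(name, "*Pizz*")
    return hit and not excluded_by_match(name)


if __name__ == "__main__":
    names = ["Instapizza", "Fozzie's Pizzaiolo", "PizzaVito",
             "Pizzalicious", "Pizza Hut", "Pizzeria Roma"]
    for name in names:
        print(name, text_field_query(name), keyword_field_query(name))
```

In this model Instapizza passes the text-field query but not the keyword one, while Pizza Hut and Pizzeria Roma are rejected by both because of the must_not clauses. Note that the keyword approach relies on capitalization, so a name written entirely in lowercase would no longer match.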