How to filter a search query by multiple keywords

Question

The current mapping of the test index:

{
  "mappings": {
    "properties": {
      "tag_ids": {
        "type":"keyword"
      },
      "text": {
        "type":"text"
      }
    }
  }
}

Example documents:

#1
{
  "tag_ids":["1","3"],
  "text": "example test"
},
#2
{
  "tag_ids":["1","2"],
  "text": "example test"
},
#3
{
  "tag_ids":["2","3"],
  "text": "example test"
}

How can I search for documents and filter the results by tag_ids?

A document that matches more tag IDs should get a higher score.

Which query should I use: "match", "term", or "terms"? For example:

{
  "query": {
    "terms":{
      "tag_ids":["1", "2"]
    }
  }
}

I tried "terms", but all documents get the same score of 1.

Can I treat tag_ids as a text field and the different tag IDs as terms?

Answer 1

Score: 0

If you use a bool/should clause and add one term (or match) query for each tag ID, document #2 gets the top score (since it contains both tag IDs), followed by documents #1 and #3 with a lower but identical score, since each of them contains only one of the tag IDs:

POST tags/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "tag_ids": "1"
          }
        },
        {
          "term": {
            "tag_ids": "2"
          }
        }
      ]
    }
  }
}
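
When the list of tag IDs is not fixed, the request body above can be generated instead of written by hand. The following is a minimal Python sketch that only builds the JSON body; the helper name build_tag_query and its field parameter are hypothetical and not part of any Elasticsearch client.

```python
import json


def build_tag_query(tag_ids, field="tag_ids"):
    """Build a bool/should body with one term clause per tag ID.

    build_tag_query is a hypothetical helper used for illustration only.
    """
    return {
        "query": {
            "bool": {
                # One term clause per requested tag ID, as in the request above.
                "should": [{"term": {field: tag_id}} for tag_id in tag_ids]
            }
        }
    }


body = build_tag_query(["1", "2"])
# Print the body so it can be pasted into a _search request.
print(json.dumps(body, indent=2))
```

The printed JSON has the same structure as the hand-written request and could be sent as the body of a _search call with whatever HTTP or Elasticsearch client is already in use.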

Response =>

{
  "hits" : {
    "hits" : [
      {
        "_id" : "2",
        "_score" : 1.1817236,
        "_source" : {
          "tag_ids" : [
            "1",
            "2"
          ],
          "text" : "example test"
        }
      },
      {
        "_id" : "1",
        "_score" : 0.5908618,
        "_source" : {
          "tag_ids" : [
            "1",
            "3"
          ],
          "text" : "example test"
        }
      },
      {
        "_id" : "3",
        "_score" : 0.5908618,
        "_source" : {
          "tag_ids" : [
            "2",
            "3"
          ],
          "text" : "example test"
        }
      }
    ]
  }
}
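
The ranking in this response follows from how a bool query combines its clauses: the scores of all matching should clauses are added together. The Python sketch below reproduces that arithmetic locally for the three example documents. It is a simplified model, not Elasticsearch's scoring code: it assumes every matching clause contributes the same score (0.5908618, taken from the response above), which holds here because both tags occur in the same number of documents. In a real index, per-clause scores can differ from term to term.

```python
# Example documents from the question, keyed by document ID.
DOCS = {
    "1": ["1", "3"],
    "2": ["1", "2"],
    "3": ["2", "3"],
}

# Assumption: each matching term clause contributes this score,
# as observed in the response above.
CLAUSE_SCORE = 0.5908618


def rank(docs, wanted_tags, clause_score=CLAUSE_SCORE):
    """Rank documents by the summed score of their matching should clauses."""
    hits = []
    for doc_id, tags in docs.items():
        matched = sum(1 for tag in wanted_tags if tag in tags)
        if matched:  # documents that match no clause are not returned
            hits.append((doc_id, matched * clause_score))
    # Highest score first; ties keep their original order.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


ranking = rank(DOCS, ["1", "2"])
for doc_id, score in ranking:
    print(doc_id, score)
```

Document #2 matches both clauses and therefore gets twice the score of documents #1 and #3, which match one clause each.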

huangapple
  • Posted on August 9, 2023 at 17:44:56
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76866495.html