Solr search results change when query terms are jumbled up

Question

I have indexed a file with the following fields:

  1. Content (type: text_general, uninvertible: false, indexed: true, stored: true)
  2. Category (type: text_general, uninvertible: false, indexed: true, stored: true)
  3. Title (type: text_general, uninvertible: false, indexed: true, stored: true)

with a catch-all copyField:

source: *
dest: _text_

When I search the Content field with the query Apple trade, I get 6057 docs.

But when I search for trade Apple, I get 5878 docs.

However, when the same search is performed on the catch-all field, I get the same result for both queries (6057 docs).

I don't understand what I'm doing wrong; I would expect Solr to return the same result for both queries when searching the Content field.

I am using:

  • LuceneQParser
  • ClassicSimilarity

The two queries on the Content field:

  1. Apple trade

http://localhost:8983/solr/core_name/select?q=Content%3A%20Apple%20trade

  2. trade Apple

http://localhost:8983/solr/core_name/select?q=Content%3A%20trade%20Apple

Answer 1

Score: 1

From what you just added to your question, and assuming the Lucene query parser ignores the space after the colon, the field prefix applies only to the term that immediately follows it. The query Content: Apple trade is therefore parsed as Content:Apple <default search field>:trade, so you're not searching for both the first and second term in the Content field.

When you swap their places, you're searching for Content:trade <default search field>:Apple.

The default search field is _text_ in the default configuration. Since the two queries are different and return different counts, you can assume that Content and _text_ hold different content (for example, because the index was not cleared and fully reindexed after adding the copyField instruction).
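One way to confirm this is to scope both terms to Content explicitly and ask Solr to echo the parsed query. The sketch below only builds the request URLs; it assumes the host and core name from the question, and select_url is a hypothetical helper, not part of any Solr client. It relies on two standard features: parentheses after a field name apply that field to every term inside them, and debugQuery=true adds the parsed query to the response.

```python
from urllib.parse import urlencode

# Assumed base URL, copied from the question; adjust host and core name.
BASE = "http://localhost:8983/solr/core_name/select"


def select_url(q, **extra):
    """Hypothetical helper: build a /select URL with encoded parameters."""
    params = {"q": q, **extra}
    return f"{BASE}?{urlencode(params)}"


# Parentheses apply the Content field prefix to both terms,
# so neither term falls through to the default search field.
grouped_a = select_url("Content:(Apple trade)", debugQuery="true")
grouped_b = select_url("Content:(trade Apple)", debugQuery="true")

print(grouped_a)
print(grouped_b)
```

With both terms grouped under Content, the two requests search the same field for the same terms, so the term order should no longer change the number of matches.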

If you want free-text search that maps easily to user input, use the edismax query parser instead (defType=edismax), supply the query as q=apple trade, and supply the field names as qf=Content.
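As a rough illustration of that suggestion, the following sketch assembles an edismax request from raw user input. The base URL is taken from the question, and edismax_url is a hypothetical helper name; only the defType, q, and qf parameters come from the answer itself.

```python
from urllib.parse import urlencode

# Assumed base URL, copied from the question; adjust host and core name.
BASE = "http://localhost:8983/solr/core_name/select"


def edismax_url(user_input, fields):
    """Hypothetical helper: build an edismax request for free-text input."""
    params = {
        "defType": "edismax",     # switch from the default Lucene parser
        "q": user_input,          # raw user text, no field prefixes needed
        "qf": " ".join(fields),   # fields to search, space separated
    }
    return f"{BASE}?{urlencode(params)}"


url = edismax_url("apple trade", ["Content"])
print(url)
```

Because the field list lives in qf rather than in the query string, the user's text can be passed through unchanged, whatever order the terms are typed in.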

huangapple
  • Published on January 7, 2020, 00:00:16
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59615220.html