How to search multi-language items with elasticsearch

Question

I have a Rails API app that uses the Globalization gem.

I have an Article model with id, created_at and updated_at. It has an article_translations table that contains title_en and title_ar; the same applies to body and excerpt.

I set up Elasticsearch and it works, but with only one language.

This is my concern file:

# frozen_string_literal: true

module Searchable
  extend ActiveSupport::Concern

  included do
    include Elasticsearch::Model
    include Elasticsearch::Model::Callbacks

    mapping do
      # Also tried using
      # indexes :title, type: :text

      indexes :title_en, type: :text
      indexes :body_en, type: :text
      indexes :excerpt_en, type: :text
      indexes :title_ar, type: :text
      indexes :body_ar, type: :text
      indexes :excerpt_ar, type: :text
    end

    def self.search(query)
      # build and run search
      params = {
        query: {
          bool: {
            should: [
              { match: { title: { query: query, boost: 8, fuzziness: "AUTO" } }},
              { match: { body: { query: query, boost: 6, fuzziness: "AUTO" } }},
              { match: { excerpt: { query: query, boost: 4, fuzziness: "AUTO" } }},
            ]
          }
        },
      }

      self.__elasticsearch__.search(params)
    end
  end
end

Now when I call Article.search('MyQuery'), it searches only one language: the one that was set when Article.import(force: true) ran.

So if I run the import with I18n.locale = :en, it indexes the English data; the same goes for Arabic.

This is because Globalization works as follows:
Article.first.title returns .title_en under the English locale and .title_ar under the Arabic locale, whichever the user has set. We never write the _en or _ar suffix ourselves, just title.
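To make the behaviour described above concrete, here is a minimal plain-Ruby sketch. FakeArticle and its class-level locale setting are hypothetical stand-ins written for illustration, not the gem's API; the sketch only assumes, as the question describes, that the serialized document contains whatever the locale-dependent title reader returns at import time.

```ruby
# Hypothetical stand-in for a translated model (NOT the real gem API).
class FakeArticle
  class << self
    attr_accessor :locale
  end
  self.locale = :en

  def initialize(translations)
    # e.g. { en: { title: "Hello" }, ar: { title: "مرحبا" } }
    @translations = translations
  end

  # Locale-dependent reader: returns the title for the current locale.
  def title
    @translations.fetch(self.class.locale).fetch(:title)
  end

  # Serializes only what the current locale exposes, which is why a
  # single import run ends up indexing a single language.
  def as_indexed_json
    { "title" => title }
  end
end

article = FakeArticle.new({ en: { title: "Hello" }, ar: { title: "مرحبا" } })

FakeArticle.locale = :en
english_doc = article.as_indexed_json

FakeArticle.locale = :ar
arabic_doc = article.as_indexed_json
```

In this sketch english_doc and arabic_doc each hold a single title key with different values, so whichever locale is active during the import decides what ends up in the index.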

What if I want to index both languages, so that a query searches both title_en and title_ar?

Here is the index that was created:

[Screenshot: the created index]

And this is the index after adding title_en and title_ar, which are ignored after importing:

[Screenshot: the index after adding title_en and title_ar]

Thanks in advance

Answer 1

Score: 0

I finally got it working!

I'm not sure whether this is the best way to do it, but it now works as it should with both languages :)

module Searchable
  extend ActiveSupport::Concern

  included do
    include Elasticsearch::Model
    include Elasticsearch::Model::Callbacks

    # I don't need this anymore because
    # it's overridden by the as_indexed_json method below
    #
    # mapping do
    #   indexes :title, type: :text
    #   indexes :title_en, type: :text
    #   indexes :body_en, type: :text
    #   indexes :excerpt_en, type: :text
    #   indexes :title_ar, type: :text
    #   indexes :body_ar, type: :text
    #   indexes :excerpt_ar, type: :text
    # end

    def self.search(query)
      # build and run search
      params = {
        query: {
          bool: {
            should: [
              { match: { title_en: { query: query, boost: 8, fuzziness: "AUTO" } }},
              { match: { title_ar: { query: query, boost: 8, fuzziness: "AUTO" } }},
              { match: { body_en: { query: query, boost: 6, fuzziness: "AUTO" } }},
              { match: { body_ar: { query: query, boost: 6, fuzziness: "AUTO" } }},
              { match: { excerpt_en: { query: query, boost: 4, fuzziness: "AUTO" } }},
              { match: { excerpt_ar: { query: query, boost: 4, fuzziness: "AUTO" } }}
            ]
          }
        },
      }

      self.__elasticsearch__.search(params)
    end

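    # Index the per-locale fields explicitly instead of the
    # locale-dependent title, body and excerpt.
    def as_indexed_json(options = nil)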
      self.as_json(
        only: %i[title_en title_ar body_en body_ar excerpt_en excerpt_ar],
        methods: %i[title_en title_ar body_en body_ar excerpt_en excerpt_ar]
      )
    end
  end
end
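
As a possible alternative to writing the six match clauses by hand, the same fields and boosts could be expressed as one multi_match query using Elasticsearch's "field^boost" syntax. The sketch below is plain Ruby that only builds the request hash; the helper names boosted_fields and multi_language_query are hypothetical, and the locales and boosts are copied from the code above. One caveat: with its default best_fields type, multi_match scores a document by its best-matching field rather than adding up every matching clause, so ranking may differ from the bool/should version.

```ruby
# Hypothetical helpers; locales and boosts mirror the answer's query.
LOCALES = %i[en ar].freeze
BOOSTS  = { title: 8, body: 6, excerpt: 4 }.freeze

# Builds ["title_en^8", "title_ar^8", ...] for every attribute/locale pair.
def boosted_fields(boosts: BOOSTS, locales: LOCALES)
  boosts.flat_map do |attribute, boost|
    locales.map { |locale| "#{attribute}_#{locale}^#{boost}" }
  end
end

# Returns a search body that uses multi_match instead of six match clauses.
def multi_language_query(query)
  {
    query: {
      multi_match: {
        query: query,
        fields: boosted_fields,
        fuzziness: "AUTO"
      }
    }
  }
end

params = multi_language_query("MyQuery")
```

The resulting params hash has the same shape as the one passed to __elasticsearch__.search above, so it could be swapped in there; adding another language would then only mean extending LOCALES.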
