Use a Python library for a custom filter analyzer in Elasticsearch

Question

I want to create an index for Persian-language text and add a stemmer for it. This is the English stemming setup I currently have for the description field:

    PUT my_index
    {
      "mappings": {
        "properties": {
          "description": {
            "type": "text",
            "analyzer": "english"
          }
        }
      },
      "settings": {
        "analysis": {
          "filter": {
            "english_stemmer": {
              "type": "stemmer",
              "language": "english"
            }
          }
        }
      }
    }

Now I want to know: how can I use the PersianStemmer Python library in an Elasticsearch analyzer?

Answer 1

Score: 1

You need to create a custom analyzer for that:

    PUT my_index
    {
      "settings": {
        "analysis": {
          "filter": {
            "persian_stemmer": {
              "type": "stemmer",
              "language": "persian"
            }
          },
          "analyzer": {
            "persian_analyzer": {
              "tokenizer": "standard",
              "filter": [
                "lowercase",
                "persian_stemmer"
              ]
            }
          }
        }
      },
      "mappings": {
        "properties": {
          "description": {
            "type": "text",
            "analyzer": "persian_analyzer"
          }
        }
      }
    }
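
Since the question is asked from the Python side, here is a minimal sketch of how the same request could be built and sent from Python using only the standard library. The helper names (`build_index_body`, `build_analyze_body`, `send_json`) and the `http://localhost:9200` address are assumptions made for illustration, not part of the original answer. Whether a given Elasticsearch version accepts `persian` as a `stemmer` language should be checked against that version's documentation; the `_analyze` call is one way to see which tokens the analyzer actually produces.

```python
import json
import urllib.request


def build_index_body(field="description", language="persian"):
    """Return the index-creation body from the answer as a Python dict."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "persian_stemmer": {"type": "stemmer", "language": language}
                },
                "analyzer": {
                    "persian_analyzer": {
                        "tokenizer": "standard",
                        "filter": ["lowercase", "persian_stemmer"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                field: {"type": "text", "analyzer": "persian_analyzer"}
            }
        },
    }


def build_analyze_body(text, analyzer="persian_analyzer"):
    """Return a body for the index's _analyze endpoint, used to inspect tokens."""
    return {"analyzer": analyzer, "text": text}


def send_json(method, url, body):
    """Send a JSON request and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    base_url = "http://localhost:9200"  # assumption: a local, unsecured node
    print(json.dumps(build_index_body(), indent=2))
    # Against a running cluster, the calls would look like this:
    # send_json("PUT", f"{base_url}/my_index", build_index_body())
    # send_json("POST", f"{base_url}/my_index/_analyze",
    #           build_analyze_body("some Persian text"))
```

If index creation is rejected, the error response from Elasticsearch should indicate whether the stemmer language was the problem.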

Posted by huangapple on 2023-02-26 20:41:40. Source: https://go.coder-hub.com/75572044.html