How to search for a string in indexed Elasticsearch documents in Go?


Question

I am writing a Go function to search for a string in indexed Elasticsearch documents, using the Elasticsearch Go client elastic. For example, suppose the object is a tweet:

type Tweet struct {
    User     string
    Message  string
    Retweets int
}

The search function is:

// Assumes a package-level client (*elastic.Client) and the imports
// encoding/json, fmt, reflect, and github.com/olivere/elastic.
func SearchProject() error {
    // Search with a term query
    termQuery := elastic.NewTermQuery("user", "olivere")
    searchResult, err := client.Search().
        Index("twitter").   // search in index "twitter"
        Query(&termQuery).  // specify the query
        Sort("user", true). // sort by "user" field, ascending
        From(0).Size(10).   // take documents 0-9
        Pretty(true).       // pretty print request and response JSON
        Do()                // execute
    if err != nil {
        // Return the error to the caller; a panic here would make the return unreachable
        return err
    }

    // searchResult is of type SearchResult and returns hits, suggestions,
    // and all kinds of other information from Elasticsearch.
    fmt.Printf("Query took %d milliseconds\n", searchResult.TookInMillis)

    // Each is a convenience function that iterates over hits in a search result.
    // It makes sure you don't need to check for nil values in the response.
    // However, it ignores errors in serialization. If you want full control
    // over iterating the hits, see below.
    var ttyp Tweet
    for _, item := range searchResult.Each(reflect.TypeOf(ttyp)) {
        t := item.(Tweet)
        fmt.Printf("Tweet by %s: %s\n", t.User, t.Message)
    }
    // TotalHits is another convenience function that works even when something goes wrong.
    fmt.Printf("Found a total of %d tweets\n", searchResult.TotalHits())

    // Here's how you iterate through results with full control over each step.
    if searchResult.Hits != nil {
        fmt.Printf("Found a total of %d tweets\n", searchResult.Hits.TotalHits)

        // Iterate through results
        for _, hit := range searchResult.Hits.Hits {
            // hit.Index contains the name of the index

            // Deserialize hit.Source into a Tweet (could also be just a map[string]interface{}).
            var t Tweet
            if err := json.Unmarshal(*hit.Source, &t); err != nil {
                // Deserialization failed
                return err
            }

            // Work with tweet
            fmt.Printf("Tweet by %s: %s\n", t.User, t.Message)
        }
    } else {
        // No hits
        fmt.Print("Found no tweets\n")
    }
    return nil
}

This search prints the tweets by the user 'olivere'. But if I search for 'olive', the search does not work. How do I search for a string that is only part of the User, Message, or Retweets value?

The indexing function looks like this:

func IndexProject(p *objects.ElasticProject) error {
    // Index a tweet (using JSON serialization)
    tweet1 := `{"user" : "olivere", "message" : "It's a Raggy Waltz"}`
    put1, err := client.Index().
        Index("twitter").
        Type("tweet").
        Id("1").
        BodyJson(tweet1).
        Do()
    if err != nil {
        // Return the error to the caller; a panic here would make the return unreachable
        return err
    }
    fmt.Printf("Indexed tweet %s to index %s, type %s\n", put1.Id, put1.Index, put1.Type)
    return nil
}

Output:

Indexed tweet 1 to index twitter, type tweet
Got document 1 in version 1 from index twitter, type tweet
Query took 4 milliseconds
Tweet by olivere: It's a Raggy Waltz
Found a total of 1 tweets
Found a total of 1 tweets
Tweet by olivere: It's a Raggy Waltz

Version

Go 1.4.2
Elasticsearch-1.4.4

Elasticsearch Go Library

github.com/olivere/elastic

Could anyone help me with this? Thank you.

Answer 1

Score: 3

How you search and find data depends on your analyser - from your code it's likely that the standard analyser is being used (i.e. you haven't specified an alternative in your mapping).

The standard analyser splits text into whole words, lowercases them, and indexes each word as a single token, and a term query only matches a complete token. So to match "olive" against "olivere" you could either:

  1. Change the search process

e.g. switch from a term query to a prefix query, or use a query string query with a wildcard.

  2. Change the index process

If you want to find strings within larger strings, then look at using nGrams or edge nGrams in your analyser.

Answer 2

Score: 0

multiQuery := elastic.NewMultiMatchQuery(
    term,
    "name", "address", "location", "email", "phone_number", "place", "postcode",
).Type("phrase_prefix")
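
This answer gives only the query builder, so as a rough sketch, this is the kind of request body a multi_match query of type phrase_prefix corresponds to in the Elasticsearch query DSL. It is built with the standard library only. The helper name multiMatchPrefixBody is hypothetical, and the fields "user" and "message" are taken from the question's tweet rather than from the answer's field list.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// multiMatchPrefixBody (hypothetical helper) builds the request body for a
// multi_match query of type phrase_prefix over the given fields.
func multiMatchPrefixBody(term string, fields ...string) (string, error) {
	body := map[string]interface{}{
		"query": map[string]interface{}{
			"multi_match": map[string]interface{}{
				"query":  term,
				"fields": fields,
				"type":   "phrase_prefix",
			},
		},
	}
	b, err := json.Marshal(body)
	return string(b), err
}

func main() {
	// Search for "olive" as a prefix in both the user and message fields.
	body, err := multiMatchPrefixBody("olive", "user", "message")
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```

With phrase_prefix, the last word of the search term is treated as a prefix, so "olive" can match a field containing "olivere" without changing the index.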

huangapple
  • Posted on March 20, 2015, 15:43:43
  • When reposting, please keep the link to this article: https://go.coder-hub.com/29161705.html