How to import data into Elasticsearch using Golang

Question

I am using gopkg.in/olivere/elastic.v5 and trying to import data from a JSON file into an Elasticsearch database using Go. This is my code:

package main

import (
    "gopkg.in/olivere/elastic.v5"
    "golang.org/x/net/context"
    "log"
    "os"
    "encoding/json"
)

type people struct {
    Firstname    string `json:"firstname"`
    Lastname     string `json:"lastname"`
    Institution  string `json:"institution"`
    Email        string `json:"email"`
}

type item struct {
    Id        string   `json:"id"`
    Title     string   `json:"title"`
    Journal   string   `json:"journal"`
    Volume    int      `json:"volume"`
    Number    int      `json:"number"`
    Pages     string   `json:"pages"`
    Year      int      `json:"year"`
    Authors   []people `json:"authors"`
    Abstract  string   `json:"abstract"`
    Link      string   `json:"link"`
    Keywords  []string `json:"keywords"`
    Body      string   `json:"body"`
}

var client *elastic.Client
var err error

func init() {
    client, err = elastic.NewClient()
    if err != nil {
        log.Fatal(err)
    }
}

func main() {
    var data []item

    file, err := os.Open("data.json")
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()

    jsonDecoder := json.NewDecoder(file)
    if err := jsonDecoder.Decode(&data); err != nil {
        log.Fatal("Decode: ", err)
    }

    bulkIndex("library", "article", data)
}

func bulkIndex(index string, typ string, data []item) {
    ctx := context.Background()
    for _, item := range data {
        _, err := client.Index().Index(index).Type(typ).BodyJson(item).Do(ctx)
        if err != nil {
            log.Fatal(err)
        }
    }
}

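The package documentation is huge and I am not sure whether I have gone about this the right way. The code compiles fine, but after running it, when I check my Elasticsearch database in Kibana using GET /library/article/575084573a2404eec25acdcd?pretty (575084573a2404eec25acdcd is the correct id from my JSON file), I get the following response: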

{
  "_index": "library",
  "_type": "article",
  "_id": "575084573a2404eec25acdcd",
  "found": false
}

How do I import my data?

EDIT: This is what I get when I run GET /library?pretty in Kibana:

{
  "library": {
    "aliases": {},
    "mappings": {
      "article": {
        "properties": {
          "abstract": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "authors": {
            "properties": {
              "email": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "firstname": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "institution": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "lastname": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          },
          "body": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "id": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "journal": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "keywords": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "link": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "number": {
            "type": "long"
          },
          "pages": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "title": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "volume": {
            "type": "long"
          },
          "year": {
            "type": "long"
          }
        }
      }
    },
    "settings": {
      "index": {
        "creation_date": "1486063182258",
        "number_of_shards": "5",
        "number_of_replicas": "1",
        "uuid": "_SLeDWb4QPinFcSwOCUtCw",
        "version": {
          "created": "5020099"
        },
        "provided_name": "library"
      }
    }
  }
}

Answer 1

Score: 2

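OK, I got it. I should have specified the Id for each item as well, instead of specifying only the index and type. Without an explicit Id, Elasticsearch generates its own _id for each document, so a GET with the id from the JSON file finds nothing.

The correct statement is:

_, err := client.Index().Index(index).Type(typ).Id(item.Id).BodyJson(item).Do(ctx)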
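
Since the function in the question is called bulkIndex but sends one request per document, here is a minimal sketch of what a real bulk request with explicit ids could look like on the wire. It uses only the standard library and builds the newline-delimited JSON body that Elasticsearch's _bulk endpoint expects. The names article, bulkAction, bulkMeta and buildBulkBody, and the title "Example", are hypothetical and made up for this illustration; article is a trimmed stand-in for the item struct above. Sending the body to the server is left out.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// article is a trimmed, hypothetical stand-in for the question's item struct.
type article struct {
	Id    string `json:"id"`
	Title string `json:"title"`
}

// bulkMeta holds the per-document metadata, including the explicit _id.
type bulkMeta struct {
	Index string `json:"_index"`
	Type  string `json:"_type"`
	Id    string `json:"_id"`
}

// bulkAction is the action line that precedes each document in a _bulk body.
type bulkAction struct {
	Index bulkMeta `json:"index"`
}

// buildBulkBody returns newline-delimited JSON: for every document, one action
// line carrying the explicit _id, followed by one line with the document itself.
func buildBulkBody(index, typ string, docs []article) ([]byte, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf) // Encode writes a trailing newline after each value
	for _, doc := range docs {
		if doc.Id == "" {
			// Refuse documents without an id, so none gets an auto-generated _id.
			return nil, fmt.Errorf("document %q has no id", doc.Title)
		}
		action := bulkAction{Index: bulkMeta{Index: index, Type: typ, Id: doc.Id}}
		if err := enc.Encode(action); err != nil {
			return nil, err
		}
		if err := enc.Encode(doc); err != nil {
			return nil, err
		}
	}
	return buf.Bytes(), nil
}

func main() {
	// The title is made-up sample data; the id is the one from the question.
	body, err := buildBulkBody("library", "article", []article{
		{Id: "575084573a2404eec25acdcd", Title: "Example"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(string(body))
}
```

Each pair of lines indexes one document under the _id taken from its own id field, so a later GET /library/article/<id> can find it. The client library has its own bulk support as well; this sketch only shows the request format.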

huangapple
  • Posted on February 3, 2017, 03:40:23
  • When reposting, please keep the link to this article: https://go.coder-hub.com/42010937.html