mgo - query performance seems consistently slow (500-650ms)
Question
My data layer uses Mongo aggregation a decent amount, and on average, queries are taking 500-650ms to return. I am using `mgo`.
A sample query function is shown below which represents what most of my queries look like.

    func (r userRepo) GetUserByID(id string) (User, error) {
        info, err := db.Info()
        if err != nil {
            log.Fatal(err)
        }
        session, err := mgo.Dial(info.ConnectionString())
        if err != nil {
            log.Fatal(err)
        }
        defer session.Close()

        var user User
        c := session.DB(info.Db()).C("users")
        o1 := bson.M{"$match": bson.M{"_id": id}}
        o2 := bson.M{"$project": bson.M{
            "first":           "$first",
            "last":            "$last",
            "email":           "$email",
            "fb_id":           "$fb_id",
            "groups":          "$groups",
            "fulfillments":    "$fulfillments",
            "denied_requests": "$denied_requests",
            "invites":         "$invites",
            "requests": bson.M{
                "$filter": bson.M{
                    "input": "$requests",
                    "as":    "item",
                    "cond": bson.M{
                        "$eq": []interface{}{"$$item.active", true},
                    },
                },
            },
        }}
        pipeline := []bson.M{o1, o2}
        err = c.Pipe(pipeline).One(&user)
        if err != nil {
            return user, err
        }
        return user, nil
    }

The `User` struct I have looks like the following:

    type User struct {
        ID             string        `json:"id" bson:"_id,omitempty"`
        First          string        `json:"first" bson:"first"`
        Last           string        `json:"last" bson:"last"`
        Email          string        `json:"email" bson:"email"`
        FacebookID     string        `json:"facebook_id" bson:"fb_id,omitempty"`
        Groups         []UserGroup   `json:"groups" bson:"groups"`
        Requests       []Request     `json:"requests" bson:"requests"`
        Fulfillments   []Fulfillment `json:"fulfillments" bson:"fulfillments"`
        Invites        []GroupInvite `json:"invites" bson:"invites"`
        DeniedRequests []string      `json:"denied_requests" bson:"denied_requests"`
    }

Based on what I have provided, is there anything obvious that would suggest why my queries are averaging 500-650ms?
I know that I am probably swallowing a bit of a performance hit by using aggregation pipeline, but I wouldn't expect it to be this bad.
Answer 1
Score: 13
> ... is there anything obvious that would suggest why my queries are averaging 500-650ms?

Yes, there is. You are calling `mgo.Dial()` before executing each query. `mgo.Dial()` has to connect to the MongoDB server every time, and you close that connection right after the query. Establishing the connection can easily take hundreds of milliseconds, since it includes authentication and allocating resources on both the server and the client side. This is very wasteful.
The documentation of `mgo.Dial()` says:

> This method is generally called just once for a given cluster. Further sessions to the same cluster are then established using the New or Copy methods on the obtained session. This will make them share the underlying cluster, and manage the pool of connections appropriately.

Create a global session variable, connect once on startup (e.g. in a package `init()` function), and use that session (or a copy or clone of it, obtained with `Session.Copy()` or `Session.Clone()`) in your query functions.
For example:

    var session *mgo.Session
    var info *db.Inf // Use your type here

    func init() {
        var err error
        if info, err = db.Info(); err != nil {
            log.Fatal(err)
        }
        if session, err = mgo.Dial(info.ConnectionString()); err != nil {
            log.Fatal(err)
        }
    }

    func (r userRepo) GetUserByID(id string) (User, error) {
        sess := session.Clone()
        defer sess.Close()

        // Now we use sess to execute the query:
        var user User
        c := sess.DB(info.Db()).C("users")
        // Rest of the method is unchanged...
    }
