Optimization of mgo for pathfinding

Question

I have implemented an A* algorithm in Go to find the path between two coordinates on a map. The map data is fetched with mgo from a MongoDB collection.

It is, however, very slow: around 4 seconds for a 1000 meter route. I have timed the different parts of the algorithm and concluded that the bottleneck is the fetching from the database, or, to be precise, the conversion from binary data into a data structure that Go understands.

I try to do as few requests as possible and to multi-thread the requests to speed things up, which helps to a certain extent, but it is not nearly as fast as I would like.

It seems like I'm doing something fundamentally wrong. Any suggestions at all would be helpful.

Data structure in MongoDB (nodes fetched from OSM):

{
    "_id": NumberLong(194637483),
    "lat": 55.7079899,
    "lon": 13.3756414,
    "neighbours": [NumberLong(1566264689), NumberLong(1566264806)]
}

Data structure in Go:

type Node struct {
    ID         int64    `bson:"_id" json:"id"`
    Lat        float64  `bson:"lat" json:"lat"`
    Lon        float64  `bson:"lon" json:"lon"`
    Neighbours []int64  `bson:"neighbours" json:"neighbours"`
}

Code for fetching a piece of data:

func addToBufferLong(buffer *WriteLockMap, session *mgo.Session, at, goal geo.Node, lowerLat, higherLat, lowerLon, higherLon float64) {
    c := session.DB("collectionName").C("nodes")
    query := c.Find(bson.M{"lat": bson.M{"$gt": lowerLat, "$lt": higherLat}, "lon": bson.M{"$gt": lowerLon, "$lt": higherLon}})

    var nodes []geo.Node
    query.All(&nodes)
    for _, node := range nodes {
        tmp := PathNode{0, node.DistanceTo(goal), 0, node}
        buffer.Lock()
        buffer.m[node.ID] = tmp
        buffer.Unlock()
    }
}

Edit:

The multi-threading strategy is based on splitting the area I wish to query into four squares, quadrants if you will, and querying each of them separately with addToBufferLong(...).
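
For illustration, here is a minimal sketch of such a quadrant fan-out. The function name fetchQuadrants and the sync.WaitGroup-based synchronization are assumptions; the actual code may differ. Each goroutine gets its own copied mgo session.

func fetchQuadrants(buffer *WriteLockMap, session *mgo.Session, at, goal geo.Node, lowerLat, higherLat, lowerLon, higherLon float64) {
    midLat := (lowerLat + higherLat) / 2
    midLon := (lowerLon + higherLon) / 2

    // One [minLat, maxLat, minLon, maxLon] box per quadrant.
    quadrants := [4][4]float64{
        {lowerLat, midLat, lowerLon, midLon},
        {lowerLat, midLat, midLon, higherLon},
        {midLat, higherLat, lowerLon, midLon},
        {midLat, higherLat, midLon, higherLon},
    }

    var wg sync.WaitGroup
    for _, q := range quadrants {
        wg.Add(1)
        go func(q [4]float64) {
            defer wg.Done()
            // Give each goroutine its own session copy so a single
            // socket is not shared between the four queries.
            s := session.Copy()
            defer s.Close()
            addToBufferLong(buffer, s, at, goal, q[0], q[1], q[2], q[3])
        }(q)
    }
    wg.Wait()
}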

Most recent print-outs:

> time GOMAXPROCS=8 ./toystar
Starting A star
Thread A, count: 19657, db-fetch: 0.122792104s, parsing: 0.934650055s
Thread B, count: 19574, db-fetch: 0.274384302s, parsing: 1.196350664s
Thread C, count: 4197, db-fetch: 0.388197823s, parsing: 0.930109241s
Thread D, count: 9900, db-fetch: 0.540008325s, parsing: 0.93963728s

Total database fetch: 1.534268099s
Total A star (with fetches): 1.854748244s

real 0m1.907s

where db-fetch measures the time it takes to execute the line starting with query := c.Find(...), and parsing measures the time it takes to execute query.All(&nodes).

Edit 2:

With the help of fellow Stack Overflow users, I've managed to drop the execution time significantly. Current print-outs:

> time GOMAXPROCS=8 ./toystar

Starting A star
Thread A: 0.210783141s
Thread B: 0.380938949s
Thread C: 0.441447793s
Thread D: 0.507361847s

Total database fetch: 0.507543875s
number of fetches: 1
Total A star: 0.826343287s

real 0m0.860s

The main difference is the multi-threading strategy and the use of *mgo.Iter instead of query.All(&nodes).
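
A minimal sketch of what the iterator-based, single-fetch version might look like; this is an assumption based on the description above, not the actual code, and the function name addToBufferIter and the batch size are made up.

func addToBufferIter(buffer *WriteLockMap, session *mgo.Session, goal geo.Node, lowerLat, higherLat, lowerLon, higherLon float64) error {
    c := session.DB("collectionName").C("nodes")
    iter := c.Find(bson.M{
        "lat": bson.M{"$gt": lowerLat, "$lt": higherLat},
        "lon": bson.M{"$gt": lowerLon, "$lt": higherLon},
    }).Batch(1000).Iter() // batch size is a guess; tune for your data

    // Decode and store one document at a time instead of materializing
    // the whole result set with query.All.
    var node geo.Node
    for iter.Next(&node) {
        tmp := PathNode{0, node.DistanceTo(goal), 0, node}
        buffer.Lock()
        buffer.m[node.ID] = tmp
        buffer.Unlock()
    }
    return iter.Close()
}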


Answer 1

Score: 2

From the information available, this is what I can deduce:

  1. Your machine seems to be slow, maybe an embedded device?

The stage you call db-fetch doesn't actually access the database. All c.Find(...) does is build a *mgo.Query value; the method is 6 lines long and shouldn't take more than a millisecond, unless there is contention on the database object's internal session state, which doesn't seem to be the case since you're only using 4 goroutines.

  2. The bottleneck seems to be the network and/or reflection

query.All(&nodes) is where the query is actually executed against the database. This phase also allocates the slice of nodes it needs and then decodes the BSON into your struct definition, document by document, via reflection.

  3. One thing you could try: *mgo.Iter

Instead of query.All(...) you can use query.Iter() to obtain a *mgo.Iter and iterate over your data set batch by batch. This may improve performance by spreading the network load better over time.

  4. Use geospatial indexing; MongoDB supports it

See the documentation. Maybe you're already doing this; if not, it may improve lookup time.
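
For example, a hypothetical sketch with mgo, assuming the documents were restructured to store a legacy [lon, lat] coordinate pair in a location field; the schema above keeps lat and lon as separate fields, so this would require a data migration.

c := session.DB("collectionName").C("nodes")

// Build a 2d index on the (hypothetical) location field once, e.g. at startup.
if err := c.EnsureIndex(mgo.Index{Key: []string{"$2d:location"}}); err != nil {
    log.Fatal(err)
}

// A bounding-box query that can use the 2d index.
var nodes []geo.Node
err := c.Find(bson.M{
    "location": bson.M{"$geoWithin": bson.M{
        "$box": [][]float64{{lowerLon, lowerLat}, {higherLon, higherLat}},
    }},
}).All(&nodes)
if err != nil {
    log.Fatal(err)
}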

  5. Split your quadrants into subquadrants, recursively.

I think this one is obvious. Divide and conquer, right?
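
A purely illustrative sketch (not from the original post) of recursively splitting a bounding box into subquadrants; each leaf box could then be queried concurrently as in the question.

// splitBox divides [lowerLat, higherLat] x [lowerLon, higherLon] into
// 4^depth leaf boxes, appending each as [minLat, maxLat, minLon, maxLon].
func splitBox(lowerLat, higherLat, lowerLon, higherLon float64, depth int, leaves *[][4]float64) {
    if depth == 0 {
        *leaves = append(*leaves, [4]float64{lowerLat, higherLat, lowerLon, higherLon})
        return
    }
    midLat := (lowerLat + higherLat) / 2
    midLon := (lowerLon + higherLon) / 2
    splitBox(lowerLat, midLat, lowerLon, midLon, depth-1, leaves)
    splitBox(lowerLat, midLat, midLon, higherLon, depth-1, leaves)
    splitBox(midLat, higherLat, lowerLon, midLon, depth-1, leaves)
    splitBox(midLat, higherLat, midLon, higherLon, depth-1, leaves)
}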

英文:

From the information available, this is what I can deduce:

  1. Your machine seems to be slow, maybe an embedded device?

The stage you called db-fetch doesn't actually access the database. The only thing c.Find(...) does is building a *mgo.Query value. The method is 6 lines long. This shouldn't take more than a millisecond. Unless there's contention on the database object's internal session state, which doesn't seem to be the case because you're using only 4 goroutines.

  1. Bottleneck seems to be network and/or reflection

query.All(&nodes) is where the query is actually executed on your database. In addition, this phase allocates the slice of nodes it needs and then iteratively decodes bson into your struct definition via reflection.

  1. One thing you could try: *mgo.iter

Instead of query.All(...) you could use query.Iter() to obtain a *mgo.Iter and iterate batch-wise over your data sets. This might improve performance by better distributing network load over time.

  1. Use geospacial indexing, Mongo supports it

See the documentation. Maybe you're already doing this. If not, it might improve lookup time.

  1. Split up your quadrants into subquadrants, recursively.

I think this one is obvious. Divide and conquer, right? 优化MGO用于路径规划

huangapple
  • Posted on 2015-10-15 19:04:33
  • Please retain this link when reposting: https://go.coder-hub.com/33146832.html