Optimization of mgo for pathfinding

Question
I have implemented an A* algorithm in Go to find the path between two coordinates on a map. The map data is fetched with mgo from a MongoDB collection.
It is, however, very slow: around 4 seconds for a 1000 meter route. I have timed the different parts of the algorithm and concluded that the bottleneck is in the fetching from the database. Or, to be precise: in the conversion from binary data to a data structure that Go understands.
I try to do as few requests as possible and multi-thread the requests to make them faster, which helps to a certain extent. But it's not nearly as fast as I wish it was.
It seems like I'm doing something fundamentally wrong. Any suggestions at all would be helpful.
Data structure in MongoDB (nodes fetched from OSM):

{
    "_id": NumberLong(194637483),
    "lat": 55.7079899,
    "lon": 13.3756414,
    "neighbours": [NumberLong(1566264689), NumberLong(1566264806)]
}
Data structure in Go:

type Node struct {
    ID         int64   `bson:"_id" json:"id"`
    Lat        float64 `bson:"lat" json:"lat"`
    Lon        float64 `bson:"lon" json:"lon"`
    Neighbours []int64 `bson:"neighbours" json:"neighbours"`
}
Code for getting a piece of data:

func addToBufferLong(buffer *WriteLockMap, session *mgo.Session, at, goal geo.Node, lowerLat, higherLat, lowerLon, higherLon float64) {
    c := session.DB("collectionName").C("nodes")
    query := c.Find(bson.M{"lat": bson.M{"$gt": lowerLat, "$lt": higherLat}, "lon": bson.M{"$gt": lowerLon, "$lt": higherLon}})
    var nodes []geo.Node
    if err := query.All(&nodes); err != nil {
        log.Println("query failed:", err)
        return
    }
    for _, node := range nodes {
        tmp := PathNode{0, node.DistanceTo(goal), 0, node}
        buffer.Lock()
        buffer.m[node.ID] = tmp
        buffer.Unlock()
    }
}
Edit:
The multi-threading strategy is based on splitting the area I wish to query into 4 different squares, quadrants if you will, and querying each of them separately with addToBufferLong(...).
Most recent print-outs:
> time GOMAXPROCS=8 ./toystar
Starting A star
Thread A, count: 19657, db-fetch: 0.122792104s, parsing: 0.934650055s
Thread B, count: 19574, db-fetch: 0.274384302s, parsing: 1.196350664s
Thread C, count: 4197, db-fetch: 0.388197823s, parsing: 0.930109241s
Thread D, count: 9900, db-fetch: 0.540008325s, parsing: 0.93963728s
Total database fetch: 1.534268099s
Total A star (with fetches): 1.854748244s
real 0m1.907s
where db-fetch measures the time it takes to execute the line that starts with query := c.Find(...), and parsing measures the time it takes to execute query.All(&nodes).
Edit 2:
I've managed to drop the execution time significantly, with the help of fellow Stack Overflow users. Current print-outs:
> time GOMAXPROCS=8 ./toystar
Starting A star
Thread A: 0.210783141s
Thread B: 0.380938949s
Thread C: 0.441447793s
Thread D: 0.507361847s
Total database fetch: 0.507543875s
number of fetches: 1
Total A star: 0.826343287s
real 0m0.860s
The main differences are the multi-threading strategy and using *mgo.Iter instead of query.All(&nodes).
Answer 1

Score: 2
From the information available, this is what I can deduce:

- Your machine seems to be slow, maybe an embedded device?
  The stage you called db-fetch doesn't actually access the database. The only thing c.Find(...) does is build a *mgo.Query value. The method is 6 lines long and shouldn't take more than a millisecond, unless there's contention on the database object's internal session state, which doesn't seem to be the case since you're using only 4 goroutines.
- The bottleneck seems to be network and/or reflection.
  query.All(&nodes) is where the query is actually executed on your database. In addition, this phase allocates the slice of nodes it needs and then iteratively decodes bson into your struct definition via reflection.
- One thing you could try: *mgo.Iter
  Instead of query.All(...) you could use query.Iter() to obtain a *mgo.Iter and iterate batch-wise over your data set. This might improve performance by better distributing network load over time.
- Use geospatial indexing; MongoDB supports it.
  See the documentation. Maybe you're already doing this. If not, it might improve lookup time.
- Split up your quadrants into subquadrants, recursively.
  I think this one is obvious. Divide and conquer, right?
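The quadrant split the question describes, and the recursive sub-quadrant split suggested above, can both be expressed with one small pure function: split a bounding box into four equal boxes, and for the recursive variant apply the same function again to each result. A sketch (the `BBox` type and function name are illustrative, not from the question):

```go
package main

import "fmt"

// BBox is an axis-aligned lat/lon bounding box, matching the
// lowerLat/higherLat/lowerLon/higherLon parameters in the question.
type BBox struct {
	LowerLat, HigherLat, LowerLon, HigherLon float64
}

// Quadrants splits a bounding box into four equal sub-boxes, the
// "quadrants" the question queries in parallel. Calling it again on
// each result yields the recursive sub-quadrant split.
func Quadrants(b BBox) [4]BBox {
	midLat := (b.LowerLat + b.HigherLat) / 2
	midLon := (b.LowerLon + b.HigherLon) / 2
	return [4]BBox{
		{b.LowerLat, midLat, b.LowerLon, midLon},     // south-west
		{b.LowerLat, midLat, midLon, b.HigherLon},    // south-east
		{midLat, b.HigherLat, b.LowerLon, midLon},    // north-west
		{midLat, b.HigherLat, midLon, b.HigherLon},   // north-east
	}
}

func main() {
	box := BBox{55.70, 55.72, 13.37, 13.39}
	for _, q := range Quadrants(box) {
		fmt.Println(q)
	}
}
```

Each sub-box can then be fetched by its own goroutine, as in addToBufferLong(...); note that finer splits trade fewer documents per query against more round-trips, so the optimum depth depends on network latency.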
Comments