Golang web-server and concurrent approach for an in memory cache
Question
I am not new to programming, but I am relatively new to golang and still not completely used to the golang concurrency approach.
The general set-up:
- Web server (should be fast and parallel), so I use net/http.
- I need to store and retrieve lots of documents. Retrieving happens more often than storing, but the ratio is rather low, maybe 20:1.
- When retrieving, by far the most important documents are the most recently stored ones. The rest can be retrieved from the disk/DB if needed.
- Solution: an in-memory cache of the most recently added items.
- Note: On retrieval, I don't care about the last 3 seconds. Meaning, if, at time (A), I ask for a complete list of the last added items, the items added in the last 3 seconds can (partially or completely) be missing. But when asking again at time (A+3s), all those items should be in the list.
My question is about how to implement the in-memory cache.
Naive approach #1 (RWLock)
- Have a big list of items in memory.
- Guard it with a RW lock
Problem with this approach: I successfully serialized the web server.
OK, please forget about this approach.
Approach #2: Split things up
- Have X lists in memory (each with its own RWLock).
- At the start of the http handler, get a random number and choose one of the X lists; work only on that list.
- Another collector routine is started every 2.5 seconds, collecting and combining the lists.
This is better, I theoretically could even split the work between servers.
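A minimal sketch of approach #2, assuming the items are plain strings and that the caller supplies some number used to pick a list. The names ShardedBuffer, Add, and Drain are hypothetical and only serve the illustration; a plain sync.Mutex is used per list because this sketch only appends and drains.

```go
package main

import (
	"fmt"
	"sync"
)

// shard is one of the X independent lists, each guarded by its own lock.
type shard struct {
	mu    sync.Mutex
	items []string
}

// ShardedBuffer spreads writes over several shards to reduce lock contention.
type ShardedBuffer struct {
	shards []shard
}

// NewShardedBuffer creates a buffer with n independent lists (n must be > 0).
func NewShardedBuffer(n int) *ShardedBuffer {
	return &ShardedBuffer{shards: make([]shard, n)}
}

// Add appends item to the shard selected by key (any number works).
func (b *ShardedBuffer) Add(key uint64, item string) {
	s := &b.shards[key%uint64(len(b.shards))]
	s.mu.Lock()
	s.items = append(s.items, item)
	s.mu.Unlock()
}

// Drain empties every shard and returns the combined items.
// Each shard is locked only for the duration of its own copy.
func (b *ShardedBuffer) Drain() []string {
	var out []string
	for i := range b.shards {
		s := &b.shards[i]
		s.mu.Lock()
		out = append(out, s.items...)
		s.items = nil
		s.mu.Unlock()
	}
	return out
}

func main() {
	b := NewShardedBuffer(4)
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			b.Add(uint64(i), fmt.Sprintf("doc-%d", i))
		}(i)
	}
	wg.Wait()
	fmt.Println("collected:", len(b.Drain()))
}
```

The collector routine could call Drain from a goroutine driven by a time.Ticker with a 2.5 second period and merge the result into the list that readers see.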
But, for example based on the golang tour code:
    func main() {
        http.HandleFunc("/view/", makeHandler(viewHandler))
        http.HandleFunc("/edit/", makeHandler(editHandler))
        http.HandleFunc("/save/", makeHandler(saveHandler))
        http.ListenAndServe(":8080", nil)
    }
How do I pass/get a new random number in the http handler without serializing?
- It does not need to be cryptographically secure. I just want to use it to pick one of the X lists.
- I know there is a global random generator but that uses a mutex internally, so back to square 1.
- I could ask the clients (JavaScript) to provide a random number as a GET parameter. But that sounds dangerous (DoS)? Or is this OK?
- I might not know the user's IP address in the Go server (reverse proxy setup).
And: Generally, is this a good approach? Is there a better way? Also, I am now limiting myself to X, which does not scale. If I want X to change at run-time, how could I tell the handlers about that change (without becoming serial again)?
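For the run-time change of X, one option (an assumption, not something confirmed by the answer) is to publish the current configuration through an atomic.Value and let every handler load it at the start of a request. The shardSet type and the two helper functions are hypothetical.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// shardSet describes the current layout. It is treated as immutable
// once published; a change is made by publishing a new value.
type shardSet struct {
	count int // the current X
}

// current always holds a *shardSet after the first call to setShardCount.
var current atomic.Value

// setShardCount publishes a new X. Readers are never blocked.
func setShardCount(n int) {
	current.Store(&shardSet{count: n})
}

// loadShardCount is what a handler would call at the start of a request.
func loadShardCount() int {
	return current.Load().(*shardSet).count
}

func main() {
	setShardCount(4)
	fmt.Println("X =", loadShardCount())

	// Later, e.g. from an admin endpoint or a config reload:
	setShardCount(16)
	fmt.Println("X =", loadShardCount())
}
```

In a full design the published value would also carry the lists themselves, and the collector would have to drain the old set one last time after a swap so that no item is lost.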
Answer 1
Score: 1
An RWLock (sync.RWMutex in Go) does not really serialize your server. Use RLock() to read documents in parallel.
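A small sketch of that idea, assuming string items and a hypothetical RecentCache type: writers take the exclusive lock, while any number of readers can hold the read lock at the same time.

```go
package main

import (
	"fmt"
	"sync"
)

// RecentCache keeps recently added items behind a sync.RWMutex.
type RecentCache struct {
	mu    sync.RWMutex
	items []string
}

// Add takes the exclusive write lock, so it briefly blocks readers.
func (c *RecentCache) Add(item string) {
	c.mu.Lock()
	c.items = append(c.items, item)
	c.mu.Unlock()
}

// Snapshot takes the shared read lock; many readers can run concurrently.
// It returns a copy so callers never touch the shared slice unlocked.
func (c *RecentCache) Snapshot() []string {
	c.mu.RLock()
	defer c.mu.RUnlock()
	out := make([]string, len(c.items))
	copy(out, c.items)
	return out
}

func main() {
	c := &RecentCache{}
	c.Add("doc-1")
	c.Add("doc-2")

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = c.Snapshot() // concurrent readers do not block each other
		}()
	}
	wg.Wait()
	fmt.Println(c.Snapshot())
}
```

Copying under the read lock keeps the critical section short relative to whatever the handler does with the data afterwards.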
Check out the thread-safe concurrent map library for Go. It combines mutexes with sharding. I would also add CQRS at the database level; with that it could easily handle 100K concurrent requests/sec.
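To illustrate the mutex-plus-sharding idea in general terms, here is a hand-rolled sketch. It is not the API of the library mentioned above; ShardedMap, numShards, and the method names are hypothetical. The key is hashed with FNV-1a to select a shard, so two requests only contend when their keys land on the same shard.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

const numShards = 16 // assumed fixed shard count for this sketch

// mapShard is one slice of the key space with its own lock.
type mapShard struct {
	mu sync.RWMutex
	m  map[string]string
}

// ShardedMap is a fixed array of independently locked maps.
type ShardedMap [numShards]*mapShard

// NewShardedMap allocates all shards up front.
func NewShardedMap() *ShardedMap {
	var sm ShardedMap
	for i := range sm {
		sm[i] = &mapShard{m: make(map[string]string)}
	}
	return &sm
}

// shardFor hashes the key to choose a shard deterministically.
func (sm *ShardedMap) shardFor(key string) *mapShard {
	h := fnv.New32a()
	h.Write([]byte(key))
	return sm[h.Sum32()%numShards]
}

// Set stores value under key, locking only the key's shard.
func (sm *ShardedMap) Set(key, value string) {
	s := sm.shardFor(key)
	s.mu.Lock()
	s.m[key] = value
	s.mu.Unlock()
}

// Get reads the value for key under the shard's read lock.
func (sm *ShardedMap) Get(key string) (string, bool) {
	s := sm.shardFor(key)
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.m[key]
	return v, ok
}

func main() {
	sm := NewShardedMap()
	sm.Set("doc-1", "hello")
	v, ok := sm.Get("doc-1")
	fmt.Println(v, ok)
}
```

Because the shard is derived from the key, no random number is needed in the handler at all when documents have an ID.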