Should all data from the database be mapped to the model?

Question
I am doing homework on a REST API using Go and MongoDB, but I'm still wondering:
Should I create a dictionary to store data at the model level? It would let me retrieve data much faster without accessing MongoDB. But the big problem here is synchronizing the data in MongoDB with the data in the dictionary I created.
In the file models/account.go I have a struct Account, and in MongoDB I also have an Account collection that stores all the account information for the website. Should I create AccountList to hold all the data from the database to improve performance?
The source is below:
```go
// AccountList would cache Account records in memory, keyed by account ID.
var AccountList map[int]*Account

// Account mirrors a document in the MongoDB Account collection.
type Account struct {
	ID       int
	UserName string
	Password string
	Email    string
	Role     string
}
```
Answer 1
Score: 1
As with many things in software, "It Depends".
There's not enough information about the systems involved, how often the data is being queried, mutated, and so on to give a concrete answer. But because this is for homework, we can give scenarios.
The root of your question is this: should you cache results from the database?
Is it really needed?
Academically, it's OK to over-optimize. You get to play with technologies and understand how they work. In the real world, we should understand where the need for something is before implementing it. The more complex a solution is, the more important making a correct trade-off becomes.
Caching is best when you're going to use the results more often than they're going to change, and fetching from storage is expensive.
"Expensive" can vary. One operation measured in seconds can be expensive. But so can tens, hundreds, or thousands of operations close together measured in 100ms.
How should you do it?
You called out a couple drawbacks. Most importantly:
> But the big problem here is synchronizing the data in MongoDB with the data in the dictionary I created.
Synchronization is the most important thing for any distributed system.
It doesn't matter how you cache values if you have one server instance. But once you start adding instances, things get complex.
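Even in the single-instance case there is one subtlety worth showing: a plain Go map like AccountList is not safe for concurrent use by HTTP handlers. Below is a minimal read-through sketch, assuming the Account struct from the question and a hypothetical fetchAccountFromDB helper standing in for the real MongoDB query:

```go
package models

import "sync"

// Account mirrors the struct from models/account.go in the question.
type Account struct {
	ID       int
	UserName string
	Password string
	Email    string
	Role     string
}

// fetchAccountFromDB is a hypothetical stand-in for the real MongoDB
// query, e.g. collection.FindOne(ctx, ...).Decode(&acc).
func fetchAccountFromDB(id int) (*Account, error) {
	return &Account{ID: id}, nil // placeholder
}

// AccountCache is a single-instance, read-through cache. A plain map
// is not safe for concurrent use, so it is guarded by a sync.RWMutex.
type AccountCache struct {
	mu       sync.RWMutex
	accounts map[int]*Account
}

func NewAccountCache() *AccountCache {
	return &AccountCache{accounts: make(map[int]*Account)}
}

// Get returns the cached account, falling back to MongoDB on a miss.
func (c *AccountCache) Get(id int) (*Account, error) {
	c.mu.RLock()
	acc, ok := c.accounts[id]
	c.mu.RUnlock()
	if ok {
		return acc, nil
	}

	acc, err := fetchAccountFromDB(id)
	if err != nil {
		return nil, err
	}

	c.mu.Lock()
	c.accounts[id] = acc
	c.mu.Unlock()
	return acc, nil
}

// Invalidate must be called whenever an account changes in MongoDB.
func (c *AccountCache) Invalidate(id int) {
	c.mu.Lock()
	delete(c.accounts, id)
	c.mu.Unlock()
}
```

The moment you run a second instance, though, each process holds its own private map and they drift apart. That is what the distributed pattern below addresses.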
A common pattern for caching is to use a distributed key-value store. These stores let you keep results that can be shared across applications, and invalidate them when the data changes. The flow, sketched in code after this list, looks like this:
- Application checks to see if the key exists in the store.
- If so, use it.
- If not, fetch from origin and update the cache for next time.
- Separately, invalidate the key any time data needs updating.
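To make those four steps concrete, here is a minimal Go sketch. The Store interface, the ErrNotFound sentinel, and the five-minute TTL are all assumptions standing in for whichever client you end up choosing:

```go
package cache

import (
	"context"
	"encoding/json"
	"errors"
	"time"
)

// Account mirrors the struct from models/account.go in the question.
type Account struct {
	ID       int
	UserName string
	Password string
	Email    string
	Role     string
}

// ErrNotFound is a hypothetical miss sentinel; real clients have their
// own (e.g. a nil reply from Redis).
var ErrNotFound = errors.New("cache: key not found")

// Store abstracts a distributed key-value store such as Redis or memcached.
type Store interface {
	Get(ctx context.Context, key string) ([]byte, error)
	Set(ctx context.Context, key string, value []byte, ttl time.Duration) error
	Delete(ctx context.Context, key string) error
}

// GetAccount walks the steps from the list above.
func GetAccount(ctx context.Context, store Store, key string,
	fetchFromMongo func(context.Context) (*Account, error)) (*Account, error) {

	// 1-2. Check the store; on a hit, use the cached value.
	b, err := store.Get(ctx, key)
	if err == nil {
		var acc Account
		if jsonErr := json.Unmarshal(b, &acc); jsonErr == nil {
			return &acc, nil
		}
		// Fall through on a corrupt entry and refetch.
	} else if !errors.Is(err, ErrNotFound) {
		return nil, err // a real store error, not just a miss
	}

	// 3. Miss: fetch from origin and update the cache for next time.
	acc, err := fetchFromMongo(ctx)
	if err != nil {
		return nil, err
	}
	if b, err := json.Marshal(acc); err == nil {
		_ = store.Set(ctx, key, b, 5*time.Minute) // TTL is an assumption
	}
	return acc, nil
}

// 4. InvalidateAccount drops the key whenever the data is updated.
func InvalidateAccount(ctx context.Context, store Store, key string) error {
	return store.Delete(ctx, key)
}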
There are a bunch of products to use. Redis is popular, and memcached works too. But since you're using Go, check out groupcache: https://github.com/mailgun/groupcache. It was written by Google to simplify dl.google.com, and extended by Mailgun to support TTLs.
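For completeness, usage looks roughly like the sketch below, loosely adapted from the Mailgun fork's README. Treat the details as assumptions rather than a definitive API: the TTL argument on the sink and the peer-pool setup may differ between versions, and fetchAccountJSON is a hypothetical MongoDB lookup.

```go
package main

import (
	"context"
	"log"
	"net/http"
	"time"

	"github.com/mailgun/groupcache/v2"
)

// fetchAccountJSON is a hypothetical helper that loads one account
// from the MongoDB Account collection and serializes it to JSON.
func fetchAccountJSON(ctx context.Context, id string) ([]byte, error) {
	return []byte(`{"ID":1,"UserName":"demo"}`), nil // placeholder
}

func main() {
	// Every API instance joins the same peer pool, so a value cached
	// on one instance is visible to the others.
	pool := groupcache.NewHTTPPoolOpts("http://localhost:8080", &groupcache.HTTPPoolOptions{})
	pool.Set("http://localhost:8080", "http://localhost:8081")
	go func() { log.Fatal(http.ListenAndServe(":8080", pool)) }()

	accounts := groupcache.NewGroup("accounts", 64<<20, groupcache.GetterFunc(
		func(ctx context.Context, id string, dest groupcache.Sink) error {
			b, err := fetchAccountJSON(ctx, id)
			if err != nil {
				return err
			}
			// The Mailgun fork attaches a TTL to each cached value.
			return dest.SetBytes(b, time.Now().Add(5*time.Minute))
		},
	))

	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	var data []byte
	if err := accounts.Get(ctx, "1", groupcache.AllocatingByteSliceSink(&data)); err != nil {
		log.Fatal(err)
	}
	log.Printf("account 1: %s", data)
}
```

The fork also advertises explicit key removal from a group, which is how you would cover the invalidation step across peers.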