2022-07-17 06:34:21 · go

How to decline a request, if another one is already processed for the same user-id?

Question

I am trying to implement a sync service. Two clients with different user agents may POST or PATCH to /sync/user/{user_id}/resource at the same time with the same user_id. The service should update the data for the user with id={user_id} in the database.

func (syncServer *SyncServer) Upload(w http.ResponseWriter, r *http.Request, ps httprouter.Params) {
    userID := ps.ByName("user_id")
    // Reject the request if another upload for this user is still in progress.
    if isAlreadyProcessedForUser(userID) {
        w.WriteHeader(http.StatusConflict)
        return
    }
    ...
    syncServer.db.Update(userID, data)
    ...
}

The problem is that I have no idea how to correctly decline one Upload while another one is still processing a request for the same user_id. I think using mutex.Lock() is a bad idea, because this handler will run on many pods and a mutex won't help if Upload is called on different pods. What synchronization method can I use to solve this problem? Should I use an additional field in the DB? Any ideas are welcome.

Answer 1

Score: 3

There are many ways to do this (distributed locking) in a distributed system. Here are some I can come up with so far:

1. Use a Redis lock (or a lock in any similar service). You lock each user_id on receiving the first request and reject other requests for the same user_id, because they will fail to acquire the lock. A Redis lock generally has an expiration time, so a crashed holder won't leave it locked forever. Ref: https://redis.io/docs/reference/patterns/distributed-locks/
2. Use a database lock. Be careful with database locks, but a simple way to do this is with a unique index: create an uploading record with a unique(user_id) constraint before the upload, and delete it after the upload. It's possible to forget or fail to delete the record and so block every later upload for that user, so you might want to add an expired_at field to the record, then check it and drop an expired record before uploading.
3. (Specific to the question's scenario) Use a unique index on (user_id, upload_status) that applies only to rows where upload_status = 'uploading'. This is called a partial index: uniqueness is checked only among rows that match that condition. You can then create an uploading record for each request, and a second concurrent request for the same user is rejected because its insert violates the index. Expiration is still needed, so track the start_time of each upload and clean up records that have been uploading for too long. If you don't need to reclaim the disk space, you can simply mark such a record as failed, which also lets you track when and how uploads failed in the database.
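
The options above share one shape: acquire a per-user lock that expires, reject the request if the lock is already held, and release it when the upload is done. Here is a minimal sketch of that shape. The names Locker, memLocker and guardUpload are made up for illustration and are not from the question. memLocker is only an in-memory stand-in so the sketch runs on its own; across pods it would have to be backed by a shared store (for Redis, roughly SET key token NX PX ttl to acquire, and deleting the key only if it still holds the same token to release).

```go
package main

import (
	"net/http"
	"sync"
	"time"
)

// Locker is a hypothetical interface for a per-key lock with expiry.
type Locker interface {
	// TryLock returns an owner token and true if the lock was acquired.
	TryLock(key string, ttl time.Duration) (uint64, bool)
	// Unlock releases the lock only if token still owns it.
	Unlock(key string, token uint64)
}

type entry struct {
	token   uint64
	expires time.Time
}

// memLocker is an in-memory stand-in for a shared store such as Redis.
// It does NOT work across pods; it only makes the sketch self-contained.
type memLocker struct {
	mu    sync.Mutex
	next  uint64
	locks map[string]entry
}

func newMemLocker() *memLocker {
	return &memLocker{locks: make(map[string]entry)}
}

func (l *memLocker) TryLock(key string, ttl time.Duration) (uint64, bool) {
	l.mu.Lock()
	defer l.mu.Unlock()
	// The lock is busy only if an entry exists and has not expired yet.
	if e, held := l.locks[key]; held && time.Now().Before(e.expires) {
		return 0, false
	}
	l.next++
	l.locks[key] = entry{token: l.next, expires: time.Now().Add(ttl)}
	return l.next, true
}

func (l *memLocker) Unlock(key string, token uint64) {
	l.mu.Lock()
	defer l.mu.Unlock()
	// A stale owner (whose lock expired and was re-acquired) must not
	// release the new owner's lock, hence the token comparison.
	if e, held := l.locks[key]; held && e.token == token {
		delete(l.locks, key)
	}
}

// guardUpload runs upload only if no other upload holds the lock for userID.
// It returns the HTTP status code the handler should send.
func guardUpload(l Locker, userID string, ttl time.Duration, upload func() error) int {
	key := "upload:" + userID
	token, ok := l.TryLock(key, ttl)
	if !ok {
		return http.StatusConflict
	}
	defer l.Unlock(key, token)
	if err := upload(); err != nil {
		return http.StatusInternalServerError
	}
	return http.StatusOK
}
```

In the Upload handler this could be used as w.WriteHeader(guardUpload(locker, userID, 30*time.Second, doUpload)), where locker and doUpload are again placeholders. The expiry should be comfortably longer than the slowest expected upload, since a lock that expires mid-upload lets a second request in.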

CAUTION:

1. It seems that you're using Kubernetes, so any non-distributed lock should be used cautiously, depending on the level of consistency you need. Pods are volatile, and it's hard to rely on local information and still achieve consistency, because pods might be duplicated, killed, or rescheduled to another machine. This also applies to any other platform with auto-scaling or scheduling mechanisms.
2. A syncing process between a server and several clients owned by one user needs to handle at least request ordering, request deduplication, and eventual consistency (for example, Google Docs supports many people editing at the same time). There are some generic algorithms (like operational transformation), but the right choice depends on your specific use case.
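
To make the ordering and deduplication concerns concrete, here is a minimal sketch of one simple approach, a per-user version check. It is not operational transformation and may not fit every use case. It assumes each client sends the version it last saw (baseVersion) and a unique request ID with every change. All names here (store, apply, ErrStale) are hypothetical. In a SQL database the same check is often written as an UPDATE that increments the version and is conditioned on the version the client sent.

```go
package main

import (
	"errors"
	"sync"
)

// ErrStale means the client based its change on an outdated version.
var ErrStale = errors.New("stale base version")

type userState struct {
	version int             // incremented on every accepted change
	data    string          // latest accepted payload
	seen    map[string]bool // request IDs that were already applied
}

// store is an in-memory stand-in for the database.
type store struct {
	mu    sync.Mutex
	users map[string]*userState
}

func newStore() *store {
	return &store{users: make(map[string]*userState)}
}

// apply accepts a change only if it was based on the current version.
// It returns the user's version after the call.
func (s *store) apply(userID, requestID string, baseVersion int, data string) (int, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	u := s.users[userID]
	if u == nil {
		u = &userState{seen: make(map[string]bool)}
		s.users[userID] = u
	}
	// Deduplication: a retried request that was already applied is a no-op.
	if u.seen[requestID] {
		return u.version, nil
	}
	// Ordering: reject changes built on a version that is no longer current.
	if baseVersion != u.version {
		return u.version, ErrStale
	}
	u.version++
	u.data = data
	u.seen[requestID] = true
	return u.version, nil
}
```

A client that receives ErrStale would fetch the current state, reapply its change on top of it, and retry with the new version.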
