How can I distribute multiple concurrent requests to AWS Lambda functions?
Question
I want to build a cron-job-like system that gets all users from the database and makes multiple (I mean lots of) concurrent requests for each of them, performs some executions, and saves the results to the database. It will run every hour of every day, 24/7.
I came up with the following solution:
- Gets all users from the DB (that's the easy part).
- Dynamically creates lambda functions and distributes all users across them.
- Each lambda function makes concurrent requests and executions (handling results and saving them to the DB).
- Communicates between these functions with SNS when needed (a minimal sketch follows below).
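For concreteness, the SNS step could be as simple as publishing each chunk of work to a topic that the worker lambdas subscribe to. A minimal boto3 sketch, where the helper name and topic ARN are hypothetical placeholders:

import json
import boto3

sns = boto3.client('sns')

def notify_worker_lambdas(users_chunk):
    # Every lambda subscribed to this topic receives the message;
    # the topic ARN is a hypothetical placeholder
    sns.publish(
        TopicArn='arn:aws:sns:us-east-1:123456789012:user-jobs',
        Message=json.dumps({'users': users_chunk}),
    )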
So, does my approach make sense for this situation?
The most important thing here is scaling (that's why I thought of distributing all users across lambda functions, to limit concurrent requests and resources). How can we come up with a scalable and efficient design for an exponentially growing user count?
Or any other suggestions?
Answer 1
Score: 1
Here is my solution:
If 100 concurrent lambdas are not enough for your needs, create a support ticket to increase your limit; you will only be charged for what you use.
However, you still can't determine how many lambdas will be required in the future. It is not necessary to process each user in a separate lambda; instead, you can invoke a single lambda with a chunk of user data. For example, say your max lambda limit is 100 and there are 1000 users. Then you can do something like the following (I don't know Go, so here is Python code, which may not be 100% syntactically correct):
import math

users = get_users_fromdb()  # e.g. users = [1, 2, 3, ..., 1000]
number_of_users = len(users)
chunk_size = math.ceil(number_of_users / 100)  # 100 is your lambda limit
for i in range(0, number_of_users, chunk_size):
    # e.g. chunk_users_data = [1, 2, 3, ..., 10]
    chunk_users_data = users[i : i + chunk_size]
    invoke_lambda_to_process_users_chunk_data(chunk_users_data)
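The answer does not show the invoke call itself; a minimal sketch of invoke_lambda_to_process_users_chunk_data with boto3 could look like this (the worker function name is a hypothetical placeholder):

import json
import boto3

lambda_client = boto3.client('lambda')

def invoke_lambda_to_process_users_chunk_data(chunk_users_data):
    # InvocationType='Event' invokes the worker asynchronously,
    # so all chunks fan out in parallel instead of running one after another
    lambda_client.invoke(
        FunctionName='process-users-chunk',  # hypothetical worker function name
        InvocationType='Event',
        Payload=json.dumps({'users': chunk_users_data}),
    )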
Here is what you can do in the other lambda:
def lambda_handler(event, context):
    users = event.get('users', [])
    for user in users:
        try:
            process_user(user)
        except Exception as e:
            print(e)  # handle the exception / error if you want
Update:
By default, the limit for concurrently running lambdas is 100. If you have 100K users, IMO you should open a support case to increase your account's concurrent lambda limit to 1000 or more. I am working with lambda and we have a 10K limit. One more thing to keep in mind: it is not guaranteed that a single lambda invocation will be able to process all the users in its chunk, so add some logic to re-invoke with the remaining users before the timeout. A lambda can run for a maximum of 5 minutes. You can get the remaining time, in milliseconds, from the context object.
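As a hedged illustration of that re-invoke logic (the 10-second safety margin and the process_user helper are assumptions, not part of the original answer), the handler can check the remaining time on each iteration and hand the unprocessed tail of the list to a fresh asynchronous invocation of itself:

import json
import boto3

lambda_client = boto3.client('lambda')

def lambda_handler(event, context):
    users = event.get('users', [])
    for index, user in enumerate(users):
        # Leave a safety margin before the hard timeout (10 seconds is an arbitrary choice)
        if context.get_remaining_time_in_millis() < 10000:
            # Re-invoke this same function asynchronously with the users not yet processed
            lambda_client.invoke(
                FunctionName=context.function_name,
                InvocationType='Event',
                Payload=json.dumps({'users': users[index:]}),
            )
            return
        try:
            process_user(user)  # assumed helper, as in the snippet above
        except Exception as e:
            print(e)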