aiohttp: Should a ClientSession be re-used or created anew for each request?
Question
I'm using Python 3.11 with aiohttp as a client to POST data to a server. In the end, I'll be posting 3 million single requests to the same endpoint, just with different request bodies.
The server alone is quite fast.
    import aiohttp

    class MyClient:
        async def create(self, single_request_body: dict) -> bool:
            """Returns True iff the request was successful."""
            # A new session (and connection pool) is opened for every single request.
            async with aiohttp.ClientSession() as session:
                async with session.post(
                    "https://my-server.org/endpoint", data=single_request_body
                ) as response:
                    return response.status == 201
Now I'm processing my 3 million POSTs like this:
    all_request_bodies: list[dict] = 3_000_000 * [{...}]
    my_client = MyClient()
    all_post_tasks = [my_client.create(x) for x in all_request_bodies]
    await asyncio.gather(*all_post_tasks)
And it's way too slow.
I'm wondering whether this is the right way to do it. Would it speed up my application if the ClientSession were an instance variable of the client, so that I re-use the same session instead of creating a new session for each POST?
In my first tests, it didn't seem so, but maybe I'm using it wrong?
Answer 1
Score: 1
Yes, your code definitely leads to performance issues. Try this one:
    import asyncio
    import aiohttp

    class MyClient:
        def __init__(self):
            # Instantiate MyClient inside a coroutine so the session binds to the running loop.
            self.session = aiohttp.ClientSession()

        async def create(self, single_request_body: dict) -> bool:
            """Returns True if the request was successful."""
            async with self.session.post("https://my-server.org/endpoint", data=single_request_body) as response:
                return response.status == 201

    async def main():
        all_request_bodies = [{}] * 3_000_000  # Example list of request bodies
        my_client = MyClient()
        try:
            all_post_tasks = [my_client.create(x) for x in all_request_bodies]
            await asyncio.gather(*all_post_tasks)
        finally:
            await my_client.session.close()  # Release the pooled connections

    # Run the event loop
    asyncio.run(main())
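Re-using one session removes the per-request connection setup, but building three million coroutines up front and passing them all to asyncio.gather still keeps every pending request in memory at once. The following is a minimal sketch of one way to cap the work in flight with a fixed pool of workers that pull from a shared iterator. The names post_all and fake_create are hypothetical and not part of the original answer; fake_create stands in for my_client.create so that the sketch runs without a server.

```python
import asyncio
from typing import Awaitable, Callable, Iterable


async def post_all(
    bodies: Iterable[dict],
    create: Callable[[dict], Awaitable[bool]],
    workers: int = 100,
) -> int:
    """Call create(body) for every body with at most `workers` calls in flight.

    Returns the number of calls that returned True.
    """
    it = iter(bodies)
    successes = 0

    async def worker() -> None:
        nonlocal successes
        # All workers share one iterator, so each body is taken exactly once.
        for body in it:
            if await create(body):
                successes += 1

    await asyncio.gather(*(worker() for _ in range(workers)))
    return successes


async def fake_create(body: dict) -> bool:
    """Hypothetical stand-in for MyClient.create; fails only if the body says so."""
    await asyncio.sleep(0)  # yield to the event loop like a real request would
    return not body.get("fail", False)


if __name__ == "__main__":
    demo_bodies = [{"id": i} for i in range(1_000)]
    print(asyncio.run(post_all(demo_bodies, fake_create, workers=50)))
```

With a real client, the call would look like `await post_all(all_request_bodies, my_client.create, workers=100)`. A reasonable starting point for `workers` is the connector's connection limit, since extra workers would only wait for a free connection.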
Answer 2
Score: 1
Creating a new session for each request implies a new TCP connection, including all the associated overhead. You create a session in order to reuse already established connections. With that said,
- You can further improve performance by raising the limit on concurrently open connections (100 by default). Note that the best value depends on the server.
    connector = aiohttp.TCPConnector(limit=1000)  # Adjust as needed
    self.session = aiohttp.ClientSession(connector=connector)
- In aiohttp, sessions must be created inside a coroutine. You can pass the session as an argument to the create function, or instantiate it after creating MyClient:
    import aiohttp

    class MyClient:
        def __init__(self):
            self.session = None

        async def create_session(self):
            # Call this from inside a coroutine so the session binds to the running event loop.
            self.session = aiohttp.ClientSession()

        async def create(self, single_request_body: dict) -> bool:
            if self.session is None:
                raise ValueError("create the session before calling")
            async with self.session.post(
                "https://my-server.org/endpoint", data=single_request_body
            ) as response:
                return response.status == 201
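The two points above (a configurable connection limit, and a session created inside a coroutine) can be combined by making the client an async context manager, which also ensures the session is closed at the end. This is only a sketch under assumptions: the `url` and `limit` constructor parameters and the `__aenter__`/`__aexit__` methods are additions that do not appear in the original answer, and it raises RuntimeError rather than ValueError.

```python
import asyncio
from typing import Optional

import aiohttp


class MyClient:
    """Owns a single ClientSession for its whole lifetime (hypothetical refactoring)."""

    def __init__(self, url: str, limit: int = 100) -> None:
        self.url = url
        self.limit = limit  # upper bound on simultaneously open connections
        self.session: Optional[aiohttp.ClientSession] = None

    async def __aenter__(self) -> "MyClient":
        # Runs inside a coroutine, so the session binds to the running event loop.
        connector = aiohttp.TCPConnector(limit=self.limit)
        self.session = aiohttp.ClientSession(connector=connector)
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Closing the session also closes the connector it owns.
        if self.session is not None:
            await self.session.close()
            self.session = None

    async def create(self, single_request_body: dict) -> bool:
        """Return True if the server answered 201 Created."""
        if self.session is None:
            raise RuntimeError("use 'async with MyClient(...)' before calling create()")
        async with self.session.post(self.url, data=single_request_body) as response:
            return response.status == 201


async def main() -> None:
    # Placeholder body; the real payloads come from the caller.
    async with MyClient("https://my-server.org/endpoint", limit=1000) as client:
        print("created:", await client.create({"key": "value"}))
```

Running `asyncio.run(main())` would send one request through the shared session. All further calls to `client.create` inside the same `async with` block re-use the pooled connections.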