Is using too many HttpClient (HttpClientHandler) instances considered bad design, and how do I get around it?
Question
I have read a lot about the socket exhaustion problem. It has never caused me any trouble, and I have always tried to follow the recommendations as closely as possible. But now I have a question. I'm going to build a server application in ASP.NET where users can dynamically instantiate and control instances of a `MyClient` object, each of which should have its own `HttpClient` with its own `HttpClientHandler`. They must be unique because each instance uses specific `CookieContainer` and `Proxy` settings. They will therefore be created at runtime; the settings are not known in advance.
`MyClient` is a wrapper around a specific set of HTTP requests, with a flexible and deep structure built for specific services. Essentially, it includes:
- Identity, defined by the cookie and proxy settings, which the user sets manually.
- State control: through an external API, the service determines the session state and other parameters from an external source.
- Requests to the API of an external service, which let the client perform actions.
- A dynamic life cycle: it is instantiated by the user and destroyed when it is no longer needed.
Each user has a finite set of `MyClient` instances, from 1 to 1000+, that work in parallel, with life cycles ranging from 5 minutes to months. For clarity, here is some example code; it does not reflect the whole design and only serves to convey the approximate implementation.
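using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public class MyClient : IDisposable
{
    // Private setters, because both are assigned in BuildClient rather than in the constructor.
    public HttpClient Client { get; private set; }
    public HttpClientHandler Handler { get; private set; }
    public int ParameterOfExternalApi1 { get; private set; }
    public object ParameterOfExternalApi2 { get; private set; }

    public MyClient(CookieCollection cookies, IWebProxy proxy)
    {
        BuildClient(cookies, proxy);
        //
    }

    private void BuildClient(CookieCollection cookies, IWebProxy proxy)
    {
        // Building client: one plausible way, with a dedicated handler
        // that carries this instance's cookies and proxy.
        var container = new CookieContainer();
        container.Add(cookies);
        Handler = new HttpClientHandler { CookieContainer = container, Proxy = proxy };
        Client = new HttpClient(Handler);
    }

    public async Task MyHttpRequestToExternalAPI()
    {
        await Client.GetAsync("uri");
    }

    public async Task<bool> CheckSession()
    {
        // The session check is elided in the original.
        throw new NotImplementedException();
    }

    public void Dispose()
    {
        Client.Dispose(); // also disposes Handler
    }
}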
The life cycle is managed through a pool of clients. If a client is idle or expired, or the user chooses to disable it, the pool disposes of all its unmanaged resources. Again, the code is very approximate and serves only for clarity.
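using System;
using System.Collections.Concurrent;

public static class MyPool
{
    private static readonly ConcurrentDictionary<Guid, PooledClient> Pool = new();

    public static MyClient? GetClient(Guid guid)
    {
        return Pool.TryGetValue(guid, out var pooled) ? pooled.Get() : null;
    }

    public static void Dispose(Guid guid)
    {
        // Remove the entry first so no new caller can obtain the client, then release it.
        if (Pool.TryRemove(guid, out var pooled))
        {
            pooled.Dispose();
        }
    }
}

public class PooledClient : IDisposable
{
    private bool _disposed;
    private readonly MyClient _client;

    public PooledClient(MyClient client)
    {
        _client = client;
    }

    public MyClient? Get()
    {
        return _disposed ? null : _client;
    }

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;
        _client.Dispose();
    }
}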
The actual question is: will using so many `HttpClient` instances, each with its own `HttpClientHandler`, cause problems, especially with 1000, 5000, or more than 50000 instances on one server? If so, is there a solution or workaround?
Update
The same issue is described in https://github.com/dotnet/runtime/issues/35992
Maybe it will clarify my requirements for the request system.
Answer 1
Score: 1
> Each user has, although not infinite, but a set of MyClients from 1 to 1000+ instances that will work in parallel. With a life cycle from 5 minutes to months.
That is insane. Seriously. This is the crucial mistake in the design.
The real solution is to not allow such a situation. As it stands, you will exhaust sockets, and there's not much you can do about it except change the design.
First of all, limit the number of clients to dozens at most. Globally, not per user. Preferably just one, but depending on your concrete situation (custom TLS handling?) that might not be possible. Secondly, a lifetime of a month? What exactly runs so long that it needs such a huge lifetime? Does your server run a universe simulation? Even if it does, it is extremely unlikely that you need to keep a network connection open for that long.
If there is a lot of work to do, divide the data into pieces, save them to a database, and schedule workers (meaning processes distinct from the server) to pick up the pieces one by one and process them. Then you can keep the number of clients low and the work queued, and you have better control over how tasks are distributed.
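As a rough illustration of the worker idea, here is a minimal in-process sketch. It swaps the database and separate worker processes for a `System.Threading.Channels` queue inside one process, which is an assumption made to keep the example short; `WorkQueueSketch`, `ProcessAsync` and `handlePiece` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class WorkQueueSketch
{
    // Runs `workerCount` workers; each owns one HttpClient and takes pieces
    // from the queue one by one until the queue is drained.
    public static async Task ProcessAsync<T>(
        IEnumerable<T> pieces, int workerCount, Func<HttpClient, T, Task> handlePiece)
    {
        var queue = Channel.CreateUnbounded<T>();
        foreach (var piece in pieces)
        {
            queue.Writer.TryWrite(piece); // always succeeds on an unbounded channel
        }
        queue.Writer.Complete(); // no more pieces will arrive

        var workers = Enumerable.Range(0, workerCount).Select(async _ =>
        {
            using var client = new HttpClient(); // one client per worker, reused for every piece
            await foreach (var piece in queue.Reader.ReadAllAsync())
            {
                await handlePiece(client, piece);
            }
        }).ToArray();

        await Task.WhenAll(workers);
    }
}
```

With this shape the number of `HttpClient` instances equals the number of workers, no matter how many pieces are queued.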
Also, you don't have to couple cookie and proxy data with sockets (wrappers). Why would you do that? Keep the settings separate, and spawn HTTP clients on demand. Put a limit on the number of HTTP clients, and if someone tries to go above it, put them in a waiting queue.
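The last point could look roughly like the sketch below. `ClientSettings` and `LimitedClientGate` are hypothetical names, and a `SemaphoreSlim` stands in for the waiting queue (it makes extra callers wait, but does not guarantee first-in, first-out order).

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical settings record: identity data lives apart from any socket.
public sealed record ClientSettings(CookieCollection Cookies, IWebProxy? Proxy);

// Hypothetical gate: at most `limit` HTTP clients exist at once; extra callers wait.
public sealed class LimitedClientGate
{
    private readonly SemaphoreSlim _slots;

    public LimitedClientGate(int limit) => _slots = new SemaphoreSlim(limit, limit);

    public int FreeSlots => _slots.CurrentCount;

    public async Task<T> RunAsync<T>(ClientSettings settings, Func<HttpClient, Task<T>> work)
    {
        await _slots.WaitAsync(); // callers above the limit wait here
        try
        {
            // Build the client on demand from the stored settings.
            var cookies = new CookieContainer();
            cookies.Add(settings.Cookies);
            var handler = new HttpClientHandler
            {
                CookieContainer = cookies,
                Proxy = settings.Proxy,
                UseProxy = settings.Proxy != null,
            };
            using var client = new HttpClient(handler); // disposing the client disposes the handler
            return await work(client);
        }
        finally
        {
            _slots.Release();
        }
    }
}
```

Disposing a handler closes its connections, and those sockets can linger in `TIME_WAIT`, so a gate like this works best when each `work` delegate performs a batch of requests rather than a single one.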
Answer 2
Score: 0
Read about IHttpClientFactory.
Let the IHttpClientFactory manage each HttpClient's lifetime.
> In the above image, a ClientService (used by a controller or client code) uses an HttpClient created by the registered IHttpClientFactory. This factory assigns an HttpMessageHandler from a pool to the HttpClient. The HttpClient can be configured with Polly's policies when registering the IHttpClientFactory in the DI container with the extension method AddHttpClient.
public class MyClient
{
    public HttpClient Client { get; }
    public HttpClientHandler Handler { get; }
    public int ParameterOfExternalApi1 { get; private set; }
    public object ParameterOfExternalApi2 { get; private set; }

    public MyClient(IHttpClientFactory httpClientFactory, CookieCollection cookies, IWebProxy proxy)
    {
        Client = httpClientFactory.CreateClient(); // you can give it a name as well
    }
    // ...
}
Answer 3
Score: 0
After some research I can say that, at the moment, there is no native solution to this issue. So far, the discussions on GitHub have gone no further than considering it. You will either have to go down a level and reinvent the wheel, or build a solution that makes the server's work as easy as possible but does not solve the problem completely (or just switch to Python and use `requests`).
Here are links to threads that address this issue:
- https://github.com/dotnet/runtime/issues/77668
- https://github.com/dotnet/runtime/issues/23322
- https://github.com/dotnet/runtime/issues/35992
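One possible reading of "going down a level" is to turn off the handler's automatic cookie handling and attach each identity's cookies per request, so that many identities can share one handler and its connection pool. The sketch below is an assumption about how that might look, not something taken from the linked threads; `Identity` and `SharedTransport` are hypothetical names, and identities that need different proxies would still need one handler per proxy.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical per-identity state: cookies only, no sockets.
public sealed class Identity
{
    public CookieContainer Cookies { get; } = new CookieContainer();
}

public static class SharedTransport
{
    // One handler and connection pool shared by all identities (same proxy assumed).
    private static readonly HttpClient Client = new HttpClient(
        new HttpClientHandler { UseCookies = false });

    // Copies the identity's cookies that match this URI into the Cookie header.
    public static HttpRequestMessage BuildRequest(Identity identity, HttpMethod method, Uri uri)
    {
        var request = new HttpRequestMessage(method, uri);
        string cookieHeader = identity.Cookies.GetCookieHeader(uri);
        if (cookieHeader.Length > 0)
        {
            request.Headers.TryAddWithoutValidation("Cookie", cookieHeader);
        }
        return request;
    }

    public static async Task<HttpResponseMessage> SendAsync(Identity identity, HttpMethod method, Uri uri)
    {
        using var request = BuildRequest(identity, method, uri);
        var response = await Client.SendAsync(request);

        // Store cookies set by the server back into this identity's container.
        if (response.Headers.TryGetValues("Set-Cookie", out var values))
        {
            foreach (var value in values)
            {
                try { identity.Cookies.SetCookies(uri, value); }
                catch (CookieException) { /* skip a malformed cookie */ }
            }
        }
        return response;
    }
}
```

The cost of each additional identity is then a `CookieContainer` rather than a handler with its own sockets.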