I need multiple threadpools, but that's unadvised. How should I substitute that?
Question
In my code, I have a queue of tasks to finish, but they don't need to be done as soon as possible, so I'd like to allocate at most N threads to them and use everything else for the main software and its processing. Please don't get attached to specific thread counts; they're just examples.
So, for example: I can use up to 100 threads in the global thread pool. I'd like to use a minimum of 1 and a maximum of 20 threads for the not-important "side job" and the remaining threads for the main processing. Sometimes the main processing uses only 20 threads, sometimes it uses 90.
I've seen in multiple places that having separate thread pools is impossible and that I shouldn't even think about it anyway. If that's the case, how should I view my problem differently?
I'm not strict about having exactly 100 threads (as in the example above), so one solution would be to fire up 20 fixed threads to work on the queue and let the main application just use the thread pool. A total of 120 threads instead of 100 is something I don't care about, the OS will probably handle it, but it feels terrible. What if I need more of this behaviour? Am I going to repeat this fixed-thread hack five times? That sounds pretty lame at first sight.
P.S.: Under no circumstances do I want to stop processing the "not-that-important" queue, so having at least N workers for it is a necessity.
Answer 1
Score: 2
> I've seen at multiple places that having separate thread pools is impossible.
That appears to be a limitation of the System.Threading.ThreadPool class, which only has static methods. It is not a limitation of thread pools in general.
I do not know C#, so I don't know whether there are other freely available thread pool implementations that you could use. If worst came to worst, you could create your own thread pools. (It's not hard.)
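To give an idea of what "create your own" could look like in C#, here is a minimal sketch of a dedicated pool that owns a fixed number of threads and drains a blocking queue. `DedicatedPool` and its members are hypothetical names made up for this illustration, not a framework type, and a production version would need more care around cancellation and error handling.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper (not part of .NET): a fixed-size pool that owns its threads.
public sealed class DedicatedPool : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread[] _workers;

    public DedicatedPool(int workerCount, string name)
    {
        _workers = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            _workers[i] = new Thread(WorkLoop) { IsBackground = true, Name = $"{name}-{i}" };
            _workers[i].Start();
        }
    }

    public void Enqueue(Action work) => _queue.Add(work);

    private void WorkLoop()
    {
        // Blocks while the queue is empty; ends after CompleteAdding() once the queue is drained.
        foreach (var work in _queue.GetConsumingEnumerable())
        {
            try { work(); }
            catch (Exception ex) { Console.Error.WriteLine(ex); } // keep the worker alive
        }
    }

    // Stops accepting work, lets the workers finish what is queued, then waits for them.
    public void Dispose()
    {
        _queue.CompleteAdding();
        foreach (var t in _workers) t.Join();
        _queue.Dispose();
    }
}
```

Something like `var sidePool = new DedicatedPool(20, "side");` would then serve the side queue, while the rest of the application keeps using the built-in pool.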
> I'd like to [reserve some threads for] the not-important "side job" and the remaining for the main processing.
One option: if you can find or create a thread pool implementation that lets you use a priority queue as the task queue, you could enqueue the "side job" tasks with lower priority than the "main processing" tasks.
Another option (you would probably have to create your own thread pool for this) would be to have a single pool of threads with two queues, and some strategy for picking tasks such that:
- "main processing" tasks are generally prioritized over "side job" tasks, but
- some machinery prevents side job tasks from being completely starved when there is a continuous supply of main processing tasks.
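The two-queue idea could be sketched in C# roughly as follows. Everything here is an assumption for illustration: the `TwoQueuePool` name, the `maxMainStreak` parameter, and the specific rule "after that many main tasks in a row, a waiting side task goes next" are just one possible anti-starvation policy.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch: one set of workers, two queues, and a simple anti-starvation rule.
public sealed class TwoQueuePool : IDisposable
{
    private readonly object _gate = new object();
    private readonly Queue<Action> _main = new Queue<Action>();
    private readonly Queue<Action> _side = new Queue<Action>();
    private readonly Thread[] _workers;
    private readonly int _maxMainStreak; // main tasks allowed in a row while side work waits
    private int _mainStreak;
    private bool _closed;

    public TwoQueuePool(int workerCount, int maxMainStreak)
    {
        _maxMainStreak = maxMainStreak;
        _workers = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            _workers[i] = new Thread(WorkLoop) { IsBackground = true };
            _workers[i].Start();
        }
    }

    public void EnqueueMain(Action work) => Enqueue(_main, work);
    public void EnqueueSide(Action work) => Enqueue(_side, work);

    private void Enqueue(Queue<Action> queue, Action work)
    {
        lock (_gate) { queue.Enqueue(work); Monitor.Pulse(_gate); }
    }

    // Returns the next task, or null once the pool is closed and both queues are empty.
    private Action Take()
    {
        lock (_gate)
        {
            while (_main.Count == 0 && _side.Count == 0)
            {
                if (_closed) return null;
                Monitor.Wait(_gate);
            }
            bool sideIsDue = _mainStreak >= _maxMainStreak;
            if (_side.Count > 0 && (sideIsDue || _main.Count == 0))
            {
                _mainStreak = 0;
                return _side.Dequeue();
            }
            if (_mainStreak < _maxMainStreak) _mainStreak++;
            return _main.Dequeue();
        }
    }

    private void WorkLoop()
    {
        Action work;
        while ((work = Take()) != null)
        {
            try { work(); }
            catch (Exception ex) { Console.Error.WriteLine(ex); } // keep the worker alive
        }
    }

    public void Dispose()
    {
        lock (_gate) { _closed = true; Monitor.PulseAll(_gate); }
        foreach (var t in _workers) t.Join();
    }
}
```

With this policy the side queue gets a bounded share of the workers under load, and all of them when no main work is pending.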
Answer 2
Score: 2
TBH I don't know the current "official dogma" about this, but let me tell you a little story:
Once I was programming in Java SE and my duties involved using dedicated ThreadPools a lot.
Then I switched to another company and returned to coding in C#, and boy oh boy: since the last time I had coded in it, Tasks and the TPL had been introduced... I struggled.
But then, when I got a little more familiar and confident, it dawned on me: I had let go of Threads. I didn't cling to the concept of "my thread" anymore.
I came to the realization that I can use Dataflow and Parallel.ForEach while telling them the maximum degree of parallelism to use. And if I managed to max out the thread pool, I'd have bigger problems than this piece of code. And I didn't have to provide idle threads, because I didn't actually need to and I would have had to do sophisticated acrobatics to do it. So I didn't. Never looked back.
Now to your specific situation: you want to _de_prioritize some tasks while still guaranteeing that they are processed. Your approach is to have two differently sized ThreadPools.
Let's say we let go of the ThreadPool idea. The way I would solve the problem is to have two "pipelines":
If I have a fixed-size set of tasks, I'd probably use two Parallel.ForEach calls with different maximum-parallelism settings (ParallelOptions.MaxDegreeOfParallelism).
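A minimal sketch of that idea, assuming both work sets are known up front. `TwoLoops.RunAsync` and its parameter names are made up for this example. Note that `MaxDegreeOfParallelism` is only an upper bound on concurrent operations; it does not reserve threads for either loop.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class TwoLoops
{
    // Runs both loops concurrently; each loop is capped at its own degree of parallelism.
    public static Task RunAsync<T>(
        IEnumerable<T> mainItems, int mainMax,
        IEnumerable<T> sideItems, int sideMax,
        Action<T> process)
    {
        // Parallel.ForEach blocks its caller, so each loop gets its own task.
        Task main = Task.Run(() => Parallel.ForEach(
            mainItems, new ParallelOptions { MaxDegreeOfParallelism = mainMax }, process));
        Task side = Task.Run(() => Parallel.ForEach(
            sideItems, new ParallelOptions { MaxDegreeOfParallelism = sideMax }, process));
        return Task.WhenAll(main, side);
    }
}
```

A call such as `await TwoLoops.RunAsync(mainJobs, 80, sideJobs, 20, Handle);` (with hypothetical `mainJobs`, `sideJobs` and `Handle`) would keep the side work at no more than 20 concurrent operations.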
If I don't know the number of tasks, or if there is a constant influx of tasks ("open-ended", so to speak), I'd probably have two Dataflow pipelines (or one that splits into two, but that's a detail), in which you can even set the parallelism of each block.
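For the open-ended case, a rough sketch with TPL Dataflow could use one `ActionBlock<T>` per lane, each with its own `MaxDegreeOfParallelism`. The `Lanes` class, the lane names and the job strings are assumptions for illustration; depending on the target framework, the `System.Threading.Tasks.Dataflow` package may need to be referenced explicitly.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class Lanes
{
    // Hypothetical helper: an input queue plus a cap on concurrently running handlers.
    public static ActionBlock<T> Create<T>(Func<T, Task> handler, int maxParallelism) =>
        new ActionBlock<T>(handler, new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = maxParallelism
        });

    public static async Task DemoAsync()
    {
        var mainLane = Create<string>(async job =>
        {
            await Task.Delay(10); // stand-in for real work
            Console.WriteLine($"main: {job}");
        }, 16);
        var sideLane = Create<string>(async job =>
        {
            await Task.Delay(10);
            Console.WriteLine($"side: {job}");
        }, 2);

        mainLane.Post("render report"); // Post enqueues and returns immediately
        sideLane.Post("rebuild cache");

        // For a continuous stream, Complete() would only be called at shutdown.
        mainLane.Complete();
        sideLane.Complete();
        await Task.WhenAll(mainLane.Completion, sideLane.Completion);
    }
}
```

Because the side lane always has its own block, its items keep being processed no matter how busy the main lane is, as long as the shared thread pool is not exhausted.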
Another, different approach (already mentioned in Solomon's answer) would be to construct a priority system that deprioritizes specific items while ensuring they won't be starved. That could be as easy as having three queues: the prio queue, the low-prio queue, and the actual processing queue. Then you take from the first two and enqueue into the processing queue in a specific ratio. For example: 10 prio, then 2 low-prio if there are any, or maybe 5 to 1 (which would roughly reflect your desired 100:20 ratio, but without idle threads). Now you can scale how many of those are processed in parallel while keeping the overall ratio of prio to low-prio.
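The ratio-based feeding step could look something like the sketch below. `RatioFeeder.Interleave` is a hypothetical helper that works on two already-filled in-memory queues; a live system would run the same loop against concurrent queues and push the results into the processing queue.

```csharp
using System.Collections.Generic;

public static class RatioFeeder
{
    // Yields items in rounds: up to prioPerRound prio items, then up to lowPerRound low-prio items.
    // If one queue runs dry, the other simply continues, so nothing idles and nothing starves.
    public static IEnumerable<T> Interleave<T>(
        Queue<T> prio, Queue<T> lowPrio, int prioPerRound, int lowPerRound)
    {
        while (prio.Count > 0 || lowPrio.Count > 0)
        {
            for (int i = 0; i < prioPerRound && prio.Count > 0; i++)
                yield return prio.Dequeue();
            for (int i = 0; i < lowPerRound && lowPrio.Count > 0; i++)
                yield return lowPrio.Dequeue();
        }
    }
}
```

The resulting sequence can then be handed to whatever does the actual processing, with the degree of parallelism chosen independently of the ratio.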
How each of those utilizes threads (or not): I don't really care until it becomes a problem. And if it does, I solve that.