Why does a thread pool take tasks independently and not concurrently?
Question
I am trying to build a solid understanding of thread pool basics. I learned that a thread pool internally uses a blocking queue from which its threads 'steal' tasks to run. This means that if I had 10 tasks and 5 threads, it could run only 5 tasks at the same time, until one finishes ENTIRELY.
The question is: why not concurrently? Why not just time-slice those 10 tasks?
What is the reason for this implementation?
Answer 1
Score: 5
> Why not concurrently? Why not just time-slice those 10 tasks?
You can have a thread pool that is able to perform ten concurrent tasks. You just need to configure it to have at least ten worker threads. "Time-slicing" tasks is what threads do. What thread pools do is:
- Allow your program to control the number of threads that it uses to perform "background" tasks, and
- Allow your program to re-use threads, which can be much more efficient than creating a new thread for each new task, and then destroying the thread when the task is complete.
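
As a rough illustration of the sizing point, the sketch below checks whether ten tasks are ever all running at the same moment. The class name `PoolSizeDemo`, the helper `allRunTogether`, and the one-second timeout are made up for this example. With ten worker threads every task should see all the others start; with five, the first five tasks hold every thread and time out waiting for the rest.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class PoolSizeDemo {
    // Hypothetical helper: returns true only if all `tasks` tasks were
    // running at the same moment on a fixed pool of `threads` workers.
    static boolean allRunTogether(int threads, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch allStarted = new CountDownLatch(tasks);
        List<Future<Boolean>> results = new ArrayList<>();
        try {
            for (int i = 0; i < tasks; i++) {
                results.add(pool.submit(() -> {
                    allStarted.countDown();
                    // True only if every other task has also started
                    // before the (arbitrary) one-second timeout.
                    return allStarted.await(1, TimeUnit.SECONDS);
                }));
            }
            boolean together = true;
            for (Future<Boolean> result : results) {
                together &= result.get();
            }
            return together;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("10 threads, 10 tasks: " + allRunTogether(10, 10));
        System.out.println(" 5 threads, 10 tasks: " + allRunTogether(5, 10));
    }
}
```

The pool size is the only thing that changes between the two calls; the tasks themselves are identical.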
Answer 2
Score: 3
In order to "time-slice 10 tasks", those tasks need to be in 10 separate threads that run concurrently.
The time-slicing scheduling algorithm is implemented by the operating system, not by Java. Time slicing applies to threads in Java because Java threads are implemented as native operating system threads: every Java thread has a native thread of its own, and these threads are scheduled by the operating system as it sees fit.
There is no difference between "thread pool threads" and "raw threads" here. If you give an instance of `Runnable` to a thread (whether it's part of a thread pool or not), it will run from beginning to end, subject to the time-slicing scheduling algorithm of the operating system.
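
A minimal sketch of that point, assuming nothing beyond the standard library: the same `Runnable` is handed once to a plain `Thread` and once to a single-thread pool. The class name `RawVsPooledDemo` and the helper `runBothWays` are invented for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class RawVsPooledDemo {
    // Hypothetical helper: runs one Runnable on a raw thread and on a
    // pool thread, and returns how many times it ran to completion.
    static int runBothWays() {
        AtomicInteger completed = new AtomicInteger();
        Runnable task = () -> {
            System.out.println("running on " + Thread.currentThread().getName());
            completed.incrementAndGet();
        };

        // Raw thread: created for this one task.
        Thread raw = new Thread(task, "raw-thread");
        raw.start();

        // Pool thread: the very same Runnable, executed by a worker.
        ExecutorService pool = Executors.newFixedThreadPool(1);
        pool.execute(task);
        pool.shutdown();

        try {
            raw.join();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("completed runs: " + runBothWays());
    }
}
```

In both cases the task body is ordinary code on an ordinary thread; when and for how long it is descheduled is up to the operating system.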
So why not use thousands of threads? Why even bother with thread pools? It turns out that operating system threads are a relatively expensive and scarce resource, and therefore so are Java threads.
Since operating system threads are so expensive, Project Loom is investigating adding lightweight user-space threads to Java. Some of the details in this answer may change when/if Loom gets merged into mainstream Java.
Answer 3
Score: 1
Some good answers but I thought I'd respond to your questions specifically.
> I learnt that it internally uses blocking queue to 'steal' tasks and run them into given threads in pool. Meaning that if I had 10 tasks and 5 threads, it could run only 5 tasks at the same time, until 1 finishes ENTIRELY.
If you configure your thread pool to have 5 threads (`Executors.newFixedThreadPool(5)`), then it will start 5 threads to run your jobs. Initially 5 jobs are given to the 5 threads to run concurrently (truly in parallel only if your server has 5 CPUs available). Once one of the 5 jobs finishes, a 6th job will be immediately started on the idle thread. This continues until all 10 jobs have been run.
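
The behaviour described above can be observed with a small sketch that tracks how many jobs are running at once. The names `FixedPoolDemo` and `peakConcurrency` and the 50 ms sleep are assumptions made for the illustration. With 5 threads and 10 jobs, the observed peak should never exceed 5.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class FixedPoolDemo {
    // Hypothetical helper: submits `jobs` short jobs to a fixed pool and
    // returns the highest number seen running at the same time.
    static int peakConcurrency(int threads, int jobs) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        for (int i = 0; i < jobs; i++) {
            pool.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max); // remember the maximum
                try {
                    Thread.sleep(50); // stand-in for real work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                running.decrementAndGet();
            });
        }

        pool.shutdown(); // queued jobs still run; no new jobs are accepted
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrent jobs: " + peakConcurrency(5, 10));
    }
}
```

Jobs 6 to 10 wait in the pool's queue and are picked up as soon as a worker becomes idle.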
> Question is: Why not concurrently? Why not just time-slice those 10 tasks? What is the reason of this implementation?
If you want, you can instead use a cached thread pool (`Executors.newCachedThreadPool()`), which will start a thread for each of the 10 jobs that you submit concurrently. This might work fine for 10 jobs but won't work well with 100,000 jobs: you would not want to start 100,000 threads. We use a fixed thread pool when we want to limit the number of jobs that run concurrently. It may seem that running 5 jobs concurrently would always be slower than running 10 at once, but this isn't necessarily the case. There is a cost when the OS time-slices between the jobs, and the overall job throughput may be higher with 5 threads than with 10, depending on how many processors your hardware has. Limiting the number of concurrent jobs also puts less stress on your server and should make your application coexist better with other running applications.
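
To make the difference concrete, here is a hedged sketch that counts how many threads each kind of pool creates for 10 jobs that all stay busy until released. The counting `ThreadFactory`, the class name `CachedVsFixedDemo`, and the helper `threadsCreated` are illustrative only; the pool of 5 mirrors the example above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class CachedVsFixedDemo {
    // Hypothetical helper: returns the number of threads the pool creates
    // when `jobs` jobs are submitted and none can finish until released.
    static int threadsCreated(boolean cached, int jobs) {
        AtomicInteger created = new AtomicInteger();
        ThreadFactory countingFactory = r -> {
            created.incrementAndGet(); // called once per new worker thread
            return new Thread(r);
        };
        ExecutorService pool = cached
                ? Executors.newCachedThreadPool(countingFactory)
                : Executors.newFixedThreadPool(5, countingFactory);

        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < jobs; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // keep the worker busy
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        release.countDown(); // let every job finish
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return created.get();
    }

    public static void main(String[] args) {
        System.out.println("cached pool threads: " + threadsCreated(true, 10));
        System.out.println("fixed pool threads:  " + threadsCreated(false, 10));
    }
}
```

Because no job can finish before all have been submitted, the cached pool has no idle worker to reuse and starts one thread per job, while the fixed pool stops at 5 and queues the rest.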
See my answer here about scaling threads: https://stackoverflow.com/questions/17840397/concept-behind-putting-wait-notify-methods-in-object-class/17841450#17841450