Should I limit MaxDegreeOfParallelism because of memory concerns?
Question
I am using `Parallel.For` to start a large number of jobs (say 1000). This works well, but each job is quite memory intensive, and from what I can tell `Parallel.For` starts far more parallel jobs than I would expect.
Running on my old home dev box with 4 cores, I see 400+ jobs in progress.
This might be fine, but each of these jobs runs a relatively memory-intensive algorithm. As a result the program's memory usage is high, and I suspect performance is now being impeded by memory swapping.
Currently I am not using any `ParallelOptions`, just running with the defaults. But I am wondering whether I should adjust `MaxDegreeOfParallelism` to keep memory usage from exploding. Or am I overthinking this, and does `Parallel` already take something smarter into account?
Answer 1
Score: 3
> But I am wondering if I should be adjusting `MaxDegreeOfParallelism` to keep memory usage from exploding. Or am I overthinking this, and Parallel already takes something way smarter in account?

If you don't provide `ParallelOptions`, the default ones are used, which have `MaxDegreeOfParallelism` set to -1, i.e.:

> If it is -1, there is no limit on the number of concurrently running operations (with the exception of the `ForEachAsync` method, where -1 means `ProcessorCount`).

So the limit on parallelism is imposed by the task scheduler in use. If none is provided, the default one (`TaskScheduler.Default`) simply posts everything to the thread pool, which can allocate up to the number of threads reported by `ThreadPool.GetAvailableThreads(out int workerThreads, out int completionPortThreads)`. AFAIK the thread pool mainly considers available CPUs, threads, and CPU load (see this answer); memory is not taken into account, at least not directly (though under extreme memory pressure the GC can consume a lot of CPU, which affects the monitored resources).
So in short: you will need to test your actual workloads and adjust accordingly.
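
One way to test a workload, as suggested above, is to measure how many jobs actually run at once for a given setting. The following is a minimal sketch, not a definitive harness: the class name `BoundedParallelDemo`, the method `RunAndMeasurePeak`, and the `Thread.Sleep` stand-in for the real memory-hungry job are all hypothetical and not part of the original answer.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedParallelDemo
{
    // Runs `jobCount` simulated jobs and returns the highest number of jobs
    // that were observed running at the same time.
    // Pass -1 as maxDegreeOfParallelism to observe the default (unlimited) behavior.
    public static int RunAndMeasurePeak(int jobCount, int maxDegreeOfParallelism)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism };
        int running = 0;
        int peak = 0;

        Parallel.For(0, jobCount, options, i =>
        {
            int now = Interlocked.Increment(ref running);

            // Raise the recorded peak if this job pushed concurrency higher.
            int seen;
            while (now > (seen = Volatile.Read(ref peak)))
            {
                Interlocked.CompareExchange(ref peak, now, seen);
            }

            Thread.Sleep(10); // hypothetical stand-in for the real job
            Interlocked.Decrement(ref running);
        });

        return peak;
    }
}
```

Calling `BoundedParallelDemo.RunAndMeasurePeak(1000, -1)` and then again with, say, `Environment.ProcessorCount` gives a rough picture of how much the limit reduces the number of jobs holding memory at once. The numbers for the unlimited case will vary from run to run and from machine to machine.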
Answer 2
Score: 0
Yes, you should definitely limit the `MaxDegreeOfParallelism` to a reasonable value like `Environment.ProcessorCount`, not only because of memory usage considerations but also because you most likely want consistent behavior across subsequent `Parallel` executions. The default `MaxDegreeOfParallelism` is `-1`, which means unlimited parallelism, and in practice saturates the `ThreadPool`. A saturated `ThreadPool` creates one new thread every second, which means that the effective degree of parallelism of the `Parallel` operation increases over time. After the completion of the `Parallel` loop the `ThreadPool` is no longer saturated, and starts terminating superfluous threads at about the same rate (1/sec). So the effective degree of parallelism of the next `Parallel` operation will depend on the duration of the previous `Parallel` operation, and on how much time has passed since the completion of the previous `Parallel` operation. Basing the behavior of your program on such random non-deterministic factors is unlikely to be your intention.
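
As a rough illustration of picking "a reasonable value", the sketch below caps the degree of parallelism at the core count and, because the question is about memory, also at the number of jobs that fit into a memory budget. The memory cap is an assumption added here for illustration and is not part of the answer. The names `DegreeChooser`, `Choose`, and `RunJobs`, and the two parameters `memoryBudgetBytes` and `estimatedBytesPerJob`, are hypothetical; you would have to estimate both values for your own workload.

```csharp
using System;
using System.Threading.Tasks;

public static class DegreeChooser
{
    // Picks a degree of parallelism no larger than the core count and no larger
    // than the number of jobs that fit into the given memory budget.
    public static int Choose(int processorCount, long memoryBudgetBytes, long estimatedBytesPerJob)
    {
        if (estimatedBytesPerJob <= 0)
            throw new ArgumentOutOfRangeException(nameof(estimatedBytesPerJob));

        long byMemory = memoryBudgetBytes / estimatedBytesPerJob;
        long degree = Math.Min(processorCount, byMemory);

        // MaxDegreeOfParallelism must be -1 or at least 1, so never return 0.
        return (int)Math.Max(1, degree);
    }

    // Runs the jobs with an explicit, fixed limit instead of the default -1.
    public static void RunJobs(int jobCount, Action<int> job, long memoryBudgetBytes, long estimatedBytesPerJob)
    {
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = Choose(Environment.ProcessorCount, memoryBudgetBytes, estimatedBytesPerJob)
        };
        Parallel.For(0, jobCount, options, job);
    }
}
```

With a fixed limit like this, the degree of parallelism no longer depends on how many threads the `ThreadPool` happens to have at the moment the loop starts.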
I have posted here [2] an experimental demonstration that an unconfigured `Parallel` execution uses all the available `ThreadPool` threads, and keeps asking for more. You could also take a look at this answer [3] for a more detailed description of the inner workings of the `Parallel` class with respect to the `MaxDegreeOfParallelism` option.
2: https://stackoverflow.com/questions/1114317/does-parallel-foreach-limit-the-number-of-active-threads/75357873#75357873 "Does Parallel.ForEach limit the number of active threads?"
3: https://stackoverflow.com/questions/9538452/what-does-maxdegreeofparallelism-do/75287075#75287075 "What does MaxDegreeOfParallelism do?"