How to use multiprocessing pool with a list?
Question
I am trying to make the following code parallel: feret_diamater.py
When I call the get_min_max_feret_from_labelim() function with a labeled image (e.g. a 1000x1000 array whose labels are integers between 0 and 1100), it calls the get_min_max_feret_from_mask() function once for each label. As a result, get_min_max_feret_from_labelim() returns a list of 1101 elements. This works fine, but for a big image with many labels it takes a very long time, so I want to call get_min_max_feret_from_mask() using a multiprocessing Pool.
The original code uses this:
    for label in labels:
        results[label] = get_min_max_feret_from_mask(label_im == label)
    return results
And I want to replace this part. I tried this after adding 'ncores' to the parameter list:
    with Pool(ncores) as p:
        for label in labels:
            results[label] = p.map(get_min_max_feret_from_mask, label_im == label)
    return results
But it is not working. How could I solve this problem? Thank you.
Answer 1
Score: 0
You can write:
    with Pool(ncores) as p:
        return p.map(get_min_max_feret_from_mask, [label_im == label for label in labels])
`Pool.map` takes the iterable (list, tuple, etc.) directly as its second parameter and calls the function once per item, so no additional loop is required. Each item must be what the function expects, here one mask per label:
https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool.map
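
As a minimal sketch of that behaviour, the example below uses a hypothetical `measure_mask` function in place of `get_min_max_feret_from_mask` (it only counts the pixels in a mask) and a tiny nested-list image in place of a NumPy label image, where the mask would be `label_im == label`. The names `measure_mask`, `make_mask` and `measure_all` are assumptions made for illustration. It shows that `Pool.map` calls the function once per item of the iterable, which is why passing a single 2-D mask, as in the question, hands the function one row at a time instead of a whole mask.

```python
from multiprocessing import Pool


def measure_mask(mask):
    """Hypothetical stand-in for get_min_max_feret_from_mask: count set pixels."""
    return sum(sum(row) for row in mask)


def make_mask(label_im, label):
    """Pure-Python equivalent of the NumPy expression `label_im == label`."""
    return [[pixel == label for pixel in row] for row in label_im]


def measure_all(label_im, labels, ncores=2):
    """Run measure_mask once per label in a pool of worker processes."""
    masks = [make_mask(label_im, label) for label in labels]
    with Pool(ncores) as p:
        # One call per list item: each worker receives a whole 2-D mask.
        return p.map(measure_mask, masks)


if __name__ == "__main__":
    label_im = [
        [0, 1, 1],
        [2, 2, 1],
        [0, 2, 2],
    ]
    print(measure_all(label_im, labels=[0, 1, 2]))  # [2, 3, 4]
```

The `if __name__ == "__main__":` guard matters on platforms that start workers by spawning a fresh interpreter, because the worker re-imports the main module.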
Answer 2
Score: 0
`Pool.map` expects an iterable of arguments and returns a list of results:
    with Pool(ncores) as p:
        result_list = p.map(get_min_max_feret_from_mask, [label_im == label for label in labels])
The returned list will match the order of the arguments list, meaning the first result is the result for the first label and so on.
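
Because the order is preserved, the results can be paired back with their labels to rebuild a per-label mapping like the original `results`. The sketch below illustrates this under a few assumptions: `count_pixels` is a hypothetical stand-in for `get_min_max_feret_from_mask`, `results_by_label` is an invented helper name, and nested lists replace the NumPy label image so the example runs with the standard library alone.

```python
from multiprocessing import Pool


def count_pixels(mask):
    """Hypothetical stand-in for get_min_max_feret_from_mask."""
    return sum(sum(row) for row in mask)


def results_by_label(label_im, labels, ncores=2):
    """Return {label: result}, relying on Pool.map preserving input order."""
    # Pure-Python stand-in for the NumPy masks `label_im == label`.
    masks = [[[px == label for px in row] for row in label_im] for label in labels]
    with Pool(ncores) as p:
        result_list = p.map(count_pixels, masks)
    # result_list[i] belongs to labels[i], so zip pairs them up correctly.
    return dict(zip(labels, result_list))


if __name__ == "__main__":
    label_im = [
        [5, 5, 7],
        [7, 7, 9],
    ]
    print(results_by_label(label_im, labels=[9, 5, 7]))  # {9: 1, 5: 2, 7: 3}
```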
Another option is to use apply_async to schedule tasks without blocking. It immediately returns an AsyncResult object, which lets you retrieve the result later. This approach allows you to keep the original loop structure:
    async_results = {}
    with Pool(ncores) as p:
        for label in labels:
            async_results[label] = p.apply_async(get_min_max_feret_from_mask, (label_im == label,))
        # Collect the results before leaving the with block, which terminates the pool.
        results = {label: result.get() for label, result in async_results.items()}
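
For reference, here is a runnable sketch of the apply_async pattern. It assumes a hypothetical `count_pixels` function in place of `get_min_max_feret_from_mask` and a nested-list image in place of the NumPy array; `results_with_apply_async` is likewise an invented name. The detail worth noticing is that `get()` is called while the pool is still open, since leaving the `with` block terminates the workers.

```python
from multiprocessing import Pool


def count_pixels(mask):
    """Hypothetical stand-in for get_min_max_feret_from_mask."""
    return sum(sum(row) for row in mask)


def results_with_apply_async(label_im, labels, ncores=2):
    """Submit one task per label, then gather the results into a dict."""
    async_results = {}
    with Pool(ncores) as p:
        for label in labels:
            # Pure-Python stand-in for the NumPy mask `label_im == label`.
            mask = [[px == label for px in row] for row in label_im]
            # args must be a tuple, hence the trailing comma.
            async_results[label] = p.apply_async(count_pixels, (mask,))
        # get() blocks until that task is done; call it before the pool exits.
        return {label: res.get() for label, res in async_results.items()}


if __name__ == "__main__":
    label_im = [
        [0, 1, 1],
        [2, 2, 1],
    ]
    print(results_with_apply_async(label_im, labels=[0, 1, 2]))  # {0: 1, 1: 3, 2: 2}
```

Compared with `Pool.map`, this keeps the label as the dictionary key from the start, at the cost of managing the AsyncResult objects yourself.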