How to use multiprocessing pool with a list?

Question

I am trying to make the following code parallel: feret_diamater.py

When I call get_min_max_feret_from_labelim() with a labeled image (e.g. a 1000x1000 array whose labels are integers between 0 and 1100), it calls get_min_max_feret_from_mask() once for each label. As a result, get_min_max_feret_from_labelim() returns a list of 1101 elements. This works fine, but for a big image with many labels it takes a very long time, so I want to call get_min_max_feret_from_mask() using a multiprocessing Pool.

The original code uses this:

for label in labels:
    results[label] = get_min_max_feret_from_mask(label_im == label)
return results

And I want to replace this part. I tried this after adding 'ncores' to the parameter list:

with Pool(ncores) as p:
    for label in labels:
        results[label] = p.map(get_min_max_feret_from_mask, label_im == label)
return results

But it is not working. How could I solve this problem? Thank you.

Answer 1

Score: 0

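You can write:

with Pool(ncores) as p:
    # Pass one boolean mask per label; passing `labels` alone would hand the function integers, not masks.
    return p.map(get_min_max_feret_from_mask, [label_im == label for label in labels])

Pool.map takes the iterable (a list, tuple, etc.) directly as its second argument and calls the function once per element, so no additional loop is required: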

https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool.map
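
The attempt in the question fails because p.map(f, label_im == label) iterates over a single mask, so the function receives one row at a time instead of a whole mask. Below is a minimal runnable sketch of the one-mask-per-label pattern. It makes some assumptions: count_pixels is a hypothetical stand-in for get_min_max_feret_from_mask, make_mask is a hypothetical pure-Python substitute for the NumPy expression label_im == label, and measure_all is an illustrative name that does not come from the original code.

```python
from multiprocessing import Pool


def count_pixels(mask):
    """Hypothetical stand-in for get_min_max_feret_from_mask: count True pixels."""
    return sum(sum(row) for row in mask)


def make_mask(label_im, label):
    """Pure-Python substitute for the NumPy expression `label_im == label`."""
    return [[pixel == label for pixel in row] for row in label_im]


def measure_all(label_im, labels, ncores=2):
    # Build one mask per label; Pool.map calls count_pixels once per mask.
    masks = [make_mask(label_im, label) for label in labels]
    with Pool(ncores) as p:
        return p.map(count_pixels, masks)


if __name__ == "__main__":
    label_im = [[0, 1, 1],
                [2, 2, 1],
                [0, 2, 2]]
    print(measure_all(label_im, labels=[0, 1, 2]))  # [2, 3, 4]
```

The worker function is defined at module level so that it can be pickled and sent to the worker processes, and the demo sits behind the `__main__` guard, which is required on platforms that start workers by re-importing the main module.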

Answer 2

Score: 0

Pool.map expects an iterable of arguments, one per call, and returns a list of results:

with Pool(ncores) as p:
    result_list = p.map(get_min_max_feret_from_mask, [label_im == label for label in labels])

The returned list will match the order of the arguments list, meaning the first result is the result for the first label and so on.
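
Because the results come back in argument order, they can be paired with the labels again to recover the results[label] lookup of the original loop. This is a small sketch of that idea; describe is a hypothetical worker and results_by_label an illustrative name, neither taken from the original code.

```python
from multiprocessing import Pool


def describe(label):
    """Hypothetical worker: returns a value that depends only on its input."""
    return label * 10


def results_by_label(labels, ncores=2):
    with Pool(ncores) as p:
        # map blocks until every task has finished.
        result_list = p.map(describe, labels)
    # result_list[i] belongs to labels[i], so zip restores the label -> result mapping.
    return dict(zip(labels, result_list))


if __name__ == "__main__":
    print(results_by_label([3, 1, 2]))  # {3: 30, 1: 10, 2: 20}
```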

Another option is to use apply_async to schedule tasks without blocking. It immediately returns an AsyncResult object, which lets you retrieve the result later. This approach lets you keep the original structure:

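async_results = {}
with Pool(ncores) as p:
    for label in labels:
        async_results[label] = p.apply_async(get_min_max_feret_from_mask, (label_im == label,))
    # Collect inside the with block: leaving it terminates the pool, and unfinished tasks would never complete.
    results = {label: result.get() for label, result in async_results.items()}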
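
Here is a self-contained sketch of the apply_async pattern that can be run as-is. It calls get() while the pool is still open, since exiting the with block terminates the pool. As assumptions, count_pixels is a hypothetical stand-in for get_min_max_feret_from_mask, the nested list comprehension stands in for the NumPy expression label_im == label, and measure_async is an illustrative name.

```python
from multiprocessing import Pool


def count_pixels(mask):
    """Hypothetical stand-in for get_min_max_feret_from_mask: count True pixels."""
    return sum(sum(row) for row in mask)


def measure_async(label_im, labels, ncores=2):
    async_results = {}
    with Pool(ncores) as p:
        for label in labels:
            # Pure-Python substitute for `label_im == label`.
            mask = [[pixel == label for pixel in row] for row in label_im]
            # apply_async returns an AsyncResult immediately; args must be a tuple.
            async_results[label] = p.apply_async(count_pixels, (mask,))
        # get() blocks until that task is done, so call it before the pool is torn down.
        return {label: res.get() for label, res in async_results.items()}


if __name__ == "__main__":
    label_im = [[0, 1, 1],
                [2, 2, 1],
                [0, 2, 2]]
    print(measure_async(label_im, labels=[0, 1, 2]))  # {0: 2, 1: 3, 2: 4}
```

Keying the dictionary by label means the caller does not depend on completion order, which is not guaranteed with apply_async.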

huangapple
  • Posted on April 13, 2023 at 15:20:07
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76002664.html