Multiprocessed reduce function hangs when task queue size is above 1200 elements in Python

Question

The program works fine until the size of the to_sum list reaches about 1150. Beyond that, it hangs at the first point where task_queue = result_queue: the worker processes successfully fill the result queue and terminate, but the program then hangs. This problem does not happen if the size of the array is below 1150.
Restarting the computer sometimes lets the program handle a somewhat larger array before it hangs, but the threshold is always somewhere in the 1100-1300 range.
Do you have any idea what could cause this problem?

```python
import multiprocessing

class CustomProcess(multiprocessing.Process):

    def __init__(self, name, task_queue, result_queue, lock, chunks=2, *args, **kwargs):
        super().__init__(name=name, *args, **kwargs)
        self.name = name
        self.task_queue = task_queue
        self.result_queue = result_queue
        self.chunks = chunks
        self.lock = lock

    def run(self):
        while True:
            """
            Using a lock to avoid a race condition where 2 threads both get a number, then
            each tries to get another but it is None, so they both put their number in the result queue,
            resulting in a result queue with a bigger size. For example:
            Expected result_queue_size = 500, current_queue_size 499, after summation of 1 and 2 we will add 3 to result queue
            achieving 500.
            With race condition result is 501.
            [1, 2, None, None]
            Thread 1 gets 1.
            Thread 2 gets 2.
            Thread 1 gets None and puts 1 in result queue instead of getting 2 and summing.
            Thread 2 gets None and puts 2 in result queue instead of just getting the first None and returning.
            A lock on both gets removes the race condition.
            """
            with self.lock:
                if not self.task_queue.empty():
                    number_1 = self.task_queue.get()
                    self.task_queue.task_done()
                    if number_1 is None:
                        #Poison pill - terminate.
                        print(f"Terminated {self.name}")
                        return
                else:
                    #Queue empty - terminate.
                    return
                if not self.task_queue.empty():
                    number_2 = self.task_queue.get()
                    self.task_queue.task_done()
                    if number_2 is None:
                        #Cannot compute sum of 1 number so just add number_1 to result_queue and terminate since poison pill
                        #acquired.
                        self.result_queue.put(number_1)
                        print(f"Terminated {self.name}")
                        return
                else:
                    self.result_queue.put(number_1)
                    #Queue empty, put the 1 number in result queue and terminate.
                    return
            self.result_queue.put(number_1 + number_2)


def multiprocess_sum(array):
    if len(array) == 1:
        return array[0]
    lock = multiprocessing.Lock()
    task_queue = multiprocessing.JoinableQueue()
    [task_queue.put(element) for element in to_sum]
    task_queue_size = len(array)

    while task_queue_size > 1:
        print(task_queue.qsize(), task_queue_size)
        result_queue = multiprocessing.JoinableQueue()
        processes = [CustomProcess(name=str(i), task_queue=task_queue, result_queue=result_queue, lock=lock)
                    for i in range(8)]
        [task_queue.put(None) for process in processes]
        [process.start() for process in processes]
        #[process.join() for process in processes]
        task_queue.join()
        task_queue = result_queue
        task_queue_size = task_queue_size // 2 + task_queue_size % 2
    return result_queue.get()

if __name__ == "__main__":
    to_sum = [i for i in range(1350)]
    """
    If range is below 1200, the program will run and compute everything correctly.
    If it is above it, it will hang at the first halving, the moment the first task_queue is empty and the
    result_queue becomes the new task_queue. 
    Computer restart will make the range values fluctuate, yesterday it would hang at 1177 but run fine up to 1776.
    Queue pipe full???
    """
    print(sum(to_sum))
    for i in range(5):
        print(multiprocess_sum(to_sum))
```

Answer 1

Score: 2

This is too long for a comment, and so:

First and foremost, calls to `empty` and `qsize` on multiprocessing queues are unreliable and should not be used (read the docs). Second, the lock you are using prevents any real multiprocessing from occurring, since the bulk of the processing done in your `run` method is executed serially. Third, adding up 1350 numbers requires 1349 additions (how can it be otherwise?). So simply divide your 1350 numbers into 8 lists as evenly as possible, have each of your processes sum its list and return the result, and then add the 8 returned values to get the final total.
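As an illustration of the first point, the usual way to avoid `empty()` on a multiprocessing queue is to simply attempt the `get` and handle the `queue.Empty` exception. A minimal sketch, not taken from the answer's code (and still only a snapshot in time, since items put by another process may be in transit):

```python
import multiprocessing
import queue  # the Empty exception lives in the standard queue module

def drain(q, timeout=0.1):
    """Collect items from q without ever calling empty() or qsize()."""
    items = []
    while True:
        try:
            # get(timeout=...) raises queue.Empty once nothing arrives within the timeout
            items.append(q.get(timeout=timeout))
        except queue.Empty:
            return items

if __name__ == "__main__":
    q = multiprocessing.Queue()
    for i in range(3):
        q.put(i)
    print(drain(q))  # [0, 1, 2]
```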

As the size of your `to_sum` list grows, the contribution of that final addition of the 8 partial sums to the total running time becomes more and more negligible.

```python
import multiprocessing
from functools import reduce
from operator import add

class CustomProcess(multiprocessing.Process):

    def __init__(self, name, task_queue, result_queue, *args, **kwargs):
        super().__init__(name=name, *args, **kwargs)
        self.task_queue = task_queue
        self.result_queue = result_queue

    def run(self):
        self.result_queue.put(reduce(add, self.task_queue.get(), 0))

def split(iterable, n):  # function to split iterable in n even parts
    if type(iterable) is range and iterable.step != 1:
        # algorithm doesn't work with steps other than 1:
        iterable = list(iterable)
    l = len(iterable)
    n = min(l, n)
    k, m = divmod(l, n)
    return (iterable[i * k + min(i, m):(i + 1) * k + min(i + 1, m)] for i in range(n))

N_PROCESSES = 8

def multiprocess_sum(l):
    if len(l) == 1:
        return l[0]

    task_queue = multiprocessing.Queue()
    lists = split(l, N_PROCESSES)
    for l in lists:
        task_queue.put(l)

    result_queue = multiprocessing.Queue()

    processes = [
        CustomProcess(name=str(i), task_queue=task_queue, result_queue=result_queue)
        for i in range(N_PROCESSES)
    ]
    for process in processes:
        process.start()

    the_sum = reduce(add, (result_queue.get() for _ in range(N_PROCESSES)), 0)

    for process in processes:
        process.join()

    return the_sum

if __name__ == "__main__":
    to_sum = list(range(1350))
    print(sum(to_sum))
    for i in range(5):
        print(multiprocess_sum(to_sum))
```

Prints:

910575
910575
910575
910575
910575
910575

I wonder how large the to_sum list must be to gain any time savings by using multiprocessing for this particular problem.
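To put a rough number on that, one could time both approaches side by side. A minimal sketch, assuming the `multiprocess_sum` defined above is in scope; the break-even point varies a lot by machine, and plain `sum` is hard to beat here because the list chunks have to be pickled over to the workers:

```python
import time

def compare(n, trials=3):
    data = list(range(n))

    start = time.perf_counter()
    for _ in range(trials):
        expected = sum(data)
    serial = (time.perf_counter() - start) / trials

    start = time.perf_counter()
    for _ in range(trials):
        result = multiprocess_sum(data)  # the 8-process version defined above
    parallel = (time.perf_counter() - start) / trials

    assert result == expected
    print(f"n={n:>12,}  serial: {serial:.6f}s  multiprocess: {parallel:.6f}s")

if __name__ == "__main__":
    for n in (1_350, 1_000_000, 10_000_000):
        compare(n)
```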

Update

To do the reduction per the OP's attempt, I would use the following code. Since calls to qsize on a queue are not reliable, we keep track of the number of items on a queue using a sharable integer Value that must be incremented or decremented under control of a lock. We also create the processes once. Read the comments, please.
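For reference, the shared counter used below is a `multiprocessing.Value`, whose `get_lock()` method returns the lock guarding the underlying ctypes value. A minimal sketch of just that pattern, separate from the answer's code (names are illustrative):

```python
import multiprocessing

def worker(counter):
    for _ in range(1000):
        # increment under the value's own lock so concurrent updates are not lost
        with counter.get_lock():
            counter.value += 1

if __name__ == "__main__":
    counter = multiprocessing.Value('i', 0)  # 'i' = C signed int, initial value 0
    processes = [multiprocessing.Process(target=worker, args=(counter,)) for _ in range(4)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print(counter.value)  # 4000
```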

```python
import multiprocessing

class CustomProcess(multiprocessing.Process):

    def __init__(self,
                 name,
                 task_queue,
                 result_queue,
                 task_queue_size,
                 result_queue_size,
                 condition,
                 *args,
                 **kwargs
                 ):
        super().__init__(name=name, *args, **kwargs)
        self.name = name
        self.task_queue = task_queue
        self.result_queue = result_queue
        self.task_queue_size = task_queue_size
        self.result_queue_size = result_queue_size
        self.condition = condition

    def run(self):
        task_queue = self.task_queue
        result_queue = self.result_queue
        task_queue_size = self.task_queue_size
        result_queue_size = self.result_queue_size
        condition = self.condition

        task_queue_size_lock = task_queue_size.get_lock()

        while True:
            # When the task queue size goes down to zero, the main process
            # will move all the items from the result queue to the
            # task queue and then set the new task queue size.
            # We must not attempt to process the task queue while
            # this occurs, i.e. we must wait until the task queue size
            # is again non-zero:
            with condition:
                while task_queue_size.value == 0:
                    condition.wait()

            # No need to acquire lock for this test:
            if task_queue_size.value < 0:
                return # We are done

            # There is no guarantee we will find anything on the input queue:
            task_queue_size_lock.acquire()
            if task_queue_size.value == 0:
                task_queue_size_lock.release()
                continue

            number_1 = task_queue.get()
            task_queue_size.value -= 1
            if task_queue_size.value == 0:
                # put number on result_queue:
                task_queue_size_lock.release()
                result_queue.put(number_1)
                with result_queue_size.get_lock():
                    result_queue_size.value += 1
                task_queue.task_done()
            else:
                number_2 = task_queue.get()
                task_queue_size.value -= 1
                task_queue_size_lock.release()
                # No lock is held for the actual reduction operation:
                result_queue.put(number_1 + number_2)
                with result_queue_size.get_lock():
                    result_queue_size.value += 1
                # Since we have taken 2 elements off the task queue:
                task_queue.task_done()
                task_queue.task_done()

def multiprocess_sum(array):
    n = len(array)
    if n == 1:
        return array[0]

    task_queue = multiprocessing.JoinableQueue()
    # You should be iterating array, not to_sum, and
    # using a comprehension for its side effect is not Pythonic:
    for element in array:
        task_queue.put(element)
    task_queue_size = multiprocessing.Value('i', n)

    result_queue = multiprocessing.Queue()
    result_queue_size = multiprocessing.Value('i', 0)

    condition = multiprocessing.Condition()

    processes = [
        CustomProcess(name=str(i),
                      task_queue=task_queue,
                      result_queue=result_queue,
                      task_queue_size=task_queue_size,
                      result_queue_size=result_queue_size,
                      condition=condition,
                      )
        for i in range(8)
    ]
    for process in processes:
        process.start()

    while True:
        print('task queue size:', n)

        # Wait for task_queue to be emptied:
        task_queue.join()

        # Now we can be sure that the child processes are no longer retrieving from the task queue
        # and putting items on the result queue:

        n = result_queue_size.value
        if n == 1:
            print('task queue size:', n)
            result = result_queue.get()
            # Tell child processes to terminate:
            with condition:
                task_queue_size.value = -1
                condition.notify_all()
            return result

        # Child processes get their input from task queue.
        # So we must get items from the result queue and move them to the task queue.
        # The task queue size is now 0 so this should pause the child processes.
        for _ in range(n):
            task_queue.put(result_queue.get())
        result_queue_size.value = 0
        # Allow children to run
        with condition:
            # The new n value will be half of the previous n value (more or less)
            task_queue_size.value = n
            condition.notify_all()

if __name__ == "__main__":
    to_sum = [i for i in range(1350)]
    print(sum(to_sum))


    for i in range(5):
        print(multiprocess_sum(to_sum))
```

Prints:

910575
task queue size: 1350
task queue size: 675
task queue size: 338
task queue size: 169
task queue size: 85
task queue size: 43
task queue size: 22
task queue size: 11
task queue size: 6
task queue size: 3
task queue size: 2
task queue size: 1
910575
task queue size: 1350
task queue size: 675
task queue size: 338
task queue size: 169
task queue size: 85
task queue size: 43
task queue size: 22
task queue size: 11
task queue size: 6
task queue size: 3
task queue size: 2
task queue size: 1
910575
task queue size: 1350
task queue size: 675
etc.

Update 2

But simpler yet is to use a multiprocessing pool and to pass to the worker function pairs of items to be reduced:
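The pairing step inside the loop below is easier to follow with a tiny concrete example of the slicing (purely illustrative, using a 5-element list):

```python
from itertools import chain

results = [10, 20, 30, 40, 50]                   # n = 5
n = len(results)
pairs = zip(results[:n // 2], results[n // 2:])  # yields (10, 30) and (20, 40)
if n % 2 == 1:
    # the leftover element rides along as a 1-tuple
    pairs = chain(pairs, ((results[-1],),))
print(list(pairs))                               # [(10, 30), (20, 40), (50,)]
```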

```python
import multiprocessing
from itertools import chain

def custom_process(pair):
    return pair[0] + pair[1] if len(pair) == 2 else pair[0]

def multiprocess_sum(array):
    n = len(array)
    if n == 1:
        return array[0]

    results = array
    pool = multiprocessing.Pool(8)
    while True:
        print('n =', n)
        if n == 1:
            break

        # Create pairs:
        pairs = zip(results[:n//2], results[n//2:])
        # Did we start with an odd number of elements?
        if n % 2 == 1:
            pairs = chain(pairs, ((results[-1],),))
        # a chunksize could be specified here for improved performance:
        results = list(pool.imap_unordered(custom_process, pairs))
        n = len(results)
    result = results[0]
    pool.close()
    pool.join()
    return result

if __name__ == "__main__":
    to_sum = [i for i in range(1350)]
    print(sum(to_sum))


    for i in range(5):
        print(multiprocess_sum(to_sum))
```

Prints:

910575
n = 1350
n = 675
n = 338
n = 169
n = 85
n = 43
n = 22
n = 11
n = 6
n = 3
n = 2
n = 1
910575
n = 1350
n = 675
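A side note on the `chunksize` comment in the code above: `imap_unordered` accepts an optional `chunksize` argument that batches many small tasks per worker dispatch, which usually reduces inter-process overhead. For example (the value 50 is only illustrative):

```python
results = list(pool.imap_unordered(custom_process, pairs, chunksize=50))
```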

Answer 2

Score: 0

Using a manager solves the problem. Most likely this is because `manager.Queue()` hands back a proxy to a queue living in the manager's server process, so it is not subject to the pipe and feeder-thread buffering that a plain `multiprocessing.Queue`/`JoinableQueue` relies on.

```python
import multiprocessing

"""
Using a manager avoids the problem where the program will hang if the input is too big.
"""

class CustomProcess(multiprocessing.Process):

    def __init__(self, name, task_queue, result_queue, lock, chunks=2, *args, **kwargs):
        super().__init__(name=name, *args, **kwargs)
        self.name = name
        self.task_queue = task_queue
        self.result_queue = result_queue
        self.chunks = chunks
        self.lock = lock

    def run(self):
        while True:
            """
            Using a lock to avoid a race condition where 2 threads both get a number, then
            both to get another, but it is None so they both put the number back resulting in a result queue
            with bigger size. For example:
            Expected result_queue_size = 500, current_queue_size 499, after summation of 1 and 2 we will add 3 to result queue
            achieving 500.
            With race condition result is 501.
            [1, 2, None, None]
            Thread 1 gets 1.
            Thread 2 gets 2.
            Thread 1 gets None and puts 1 in result queue instead of getting 2 and summing.
            Thread 2 gets None and puts 2 in result queue instead of just getting the first None and returning.
            A lock on both gets removes the race condition.
            """
            with self.lock:
                if not self.task_queue.empty():
                    number_1 = self.task_queue.get()
                    self.task_queue.task_done()
                    if number_1 is None:
                        #Poison pill - terminate.
                        print(f"Terminated {self.name}")
                        return
                else:
                    #Queue empty - terminate.
                    return
                if not self.task_queue.empty():
                    number_2 = self.task_queue.get()
                    self.task_queue.task_done()
                    if number_2 is None:
                        #Cannot compute sum of 1 number so just add number_1 to result_queue and terminate since poison pill
                        #acquired.
                        self.result_queue.put(number_1)
                        print(f"Terminated {self.name}")
                        return
                else:
                    self.result_queue.put(number_1)
                    #Queue empty, put the 1 number in result queue and terminate.
                    return
            self.result_queue.put(number_1 + number_2)


def multiprocess_sum(array):
    if len(array) == 1:
        return array[0]
    lock = multiprocessing.Lock()

    with multiprocessing.Manager() as manager:
        task_queue = manager.Queue()
        [task_queue.put(element) for element in to_sum]
        task_queue_size = len(array)

        while task_queue_size > 1:
            result_queue = manager.Queue()

            processes = [CustomProcess(name=str(i), task_queue=task_queue, result_queue=result_queue, lock=lock)
                        for i in range(8)]
            [task_queue.put(None) for process in processes]
            [process.start() for process in processes]
            #[process.join() for process in processes]
            task_queue.join()
            task_queue = result_queue
            task_queue_size = task_queue_size // 2 + task_queue_size % 2
        return result_queue.get()


if __name__ == "__main__":
    to_sum = [i for i in range(2000)]

    print(sum(to_sum))
    for i in range(5):
        print(multiprocess_sum(to_sum))
```

I must agree with user Booboo that this specific implementation is not the best (might be the worst, actually) for a performant parallel sum function. If you are looking for a parallel sum function, read his answer. If you are interested in why your program hangs when faced with big Queues, using a Manager should solve your problem. Should you have an implementation of a parallel reduce function (better handling of the poison pill race condition), please do share with the intent of bettering the provided implementation for those looking for that specifically.
Best regards,
Tary
