March 21, 2023, 01:47:30

Function return value of wrong type if I use multiprocessing

Question

I have this code:

```python
from numpy import array
from cv2 import imshow, cvtColor, imwrite, imread, destroyAllWindows, COLOR_BGR2RGB
from pyscreenshot import grab
import pytesseract


filename = 'image.png'
elements_for_replace = {'iL': '1L', 'Bi': 'B1', 'Bl': 'B1', 'Ci': 'C1', 'Cl': 'C1'}

pytesseract.pytesseract.tesseract_cmd = r'C:\Users\Administrator\AppData\Local\Tesseract-OCR\tesseract.exe'


def scanning(x1, y1, x2, y2):
    screen = array(grab(bbox=(x1, y1, x2, y2)))
    imwrite(filename, screen)
    img = imread(filename)
    text = pytesseract.image_to_string(img)
    history = text.split()
    return history


def first():
    return scanning(730, 740, 1335, 790)


def second():
    return scanning(730, 453, 1335, 500)


def third():
    return scanning(817, 45, 1522, 99)


def replace_elements(data, replace_data):
    for item in data:
        if item in replace_data:
            data[data.index(item)] = replace_data[item]
    return data


def get_data():
    x = replace_elements(first(), elements_for_replace)
    y = replace_elements(second(), elements_for_replace)
    z = replace_elements(third(), elements_for_replace)
    destroyAllWindows()
    return x, y, z
```

When get_data() is called, this code uses computer vision to convert an image to text at three different locations on the screen, one after another. It then replaces the misrecognized elements with the correct ones. The output is a tuple of lists (x, y, z), which is processed by another part of the program.
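
The replacement step described above can also be sketched as a pure function. This is one possible formulation, not the code from the question: `replace_tokens` is a hypothetical name, the sample tokens are made up, and the function builds a new list with `dict.get` instead of modifying the input list while iterating over it.

```python
elements_for_replace = {'iL': '1L', 'Bi': 'B1', 'Bl': 'B1', 'Ci': 'C1', 'Cl': 'C1'}


def replace_tokens(tokens, replacements):
    # Look each token up in the mapping; unknown tokens pass through unchanged.
    return [replacements.get(token, token) for token in tokens]


# Sample OCR tokens invented for illustration.
sample = ['iL', 'A7', 'Bl', 'Ci']
print(replace_tokens(sample, elements_for_replace))  # prints ['1L', 'A7', 'B1', 'C1']
```

Because the input list is left untouched, the same captured tokens can be reused or logged after the replacement.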

Converting images to text takes a lot of time, and running the three conversions sequentially multiplies that time by 3. I concluded that I need the multiprocessing module (or rather concurrent.futures) to reduce the program's execution time.

I rewrote the function get_data() like this:

```python
import concurrent.futures


def get_data():
    with concurrent.futures.ProcessPoolExecutor() as executor:
        x = executor.submit(replace_elements, first(), elements_for_replace)
        y = executor.submit(replace_elements, second(), elements_for_replace)
        z = executor.submit(replace_elements, third(), elements_for_replace)
    destroyAllWindows()
    return x, y, z
```

Now the returned variables have the type <class 'concurrent.futures._base.Future'> instead of <class 'list'>, and the part of the program that processes this data raises "TypeError: 'Future' object is not subscriptable".
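
The symptom can be reproduced without any OCR. This is a minimal sketch, assuming a hypothetical `make_list` helper and using a thread pool in place of the process pool (both executors return the same `Future` class):

```python
from concurrent.futures import Future, ThreadPoolExecutor


def make_list():
    # Hypothetical stand-in for a function that returns a list.
    return ['B1', 'C1']


def reproduce():
    with ThreadPoolExecutor() as executor:
        future = executor.submit(make_list)
    try:
        future[0]  # a Future is a handle to the result, not the list itself
    except TypeError as exc:
        message = str(exc)
    # future.result() is what actually holds the list.
    return type(future), message, future.result()
```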

How can I start parallel execution of a function so that its return value is the same as in the first version of the code, that is, <class 'list'>?

Answer 1

Score: 2

executor.submit() returns a Future object, not the value of the function called. In order to get the value returned by the function, you must call result() on the Future object. In your case you'll want to modify your code like so:

```python
def get_data():
    with concurrent.futures.ProcessPoolExecutor() as executor:
        future_x = executor.submit(replace_elements, first(), elements_for_replace)
        future_y = executor.submit(replace_elements, second(), elements_for_replace)
        future_z = executor.submit(replace_elements, third(), elements_for_replace)
    destroyAllWindows()

    # Actually get the value of the function here
    x = future_x.result()
    y = future_y.result()
    z = future_z.result()

    return x, y, z
```
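
One caveat about the snippet above: `first()`, `second()` and `third()` are called while the arguments to `submit()` are being built, so the slow OCR still runs sequentially in the parent process and only `replace_elements` is handed to the pool. A possible restructuring, sketched here under assumptions, passes the slow function and its arguments to `submit()` so that the work itself runs in the workers. `slow_scan` is a hypothetical stand-in for `scanning` (it sleeps instead of running OCR), and `executor_cls` is a parameter added only so the sketch can be tried with threads as well as processes.

```python
import time
from concurrent.futures import ProcessPoolExecutor

# The three screen regions used by first(), second() and third() in the question.
REGIONS = [(730, 740, 1335, 790), (730, 453, 1335, 500), (817, 45, 1522, 99)]


def slow_scan(x1, y1, x2, y2):
    # Hypothetical stand-in for scanning(): pretend the OCR takes a while.
    time.sleep(0.2)
    return [f"{x1},{y1}", f"{x2},{y2}"]


def get_data(executor_cls=ProcessPoolExecutor):
    with executor_cls(max_workers=len(REGIONS)) as executor:
        # Pass the callable and its arguments; do not call the function here.
        futures = [executor.submit(slow_scan, *bbox) for bbox in REGIONS]
        # result() blocks until the corresponding worker has finished.
        return tuple(future.result() for future in futures)
```

On Windows, where worker processes are started by re-importing the main module, the call to `get_data()` should sit under an `if __name__ == "__main__":` guard. Note also that `scanning` in the question saves every capture to the same `image.png`, so parallel calls would each need a distinct file name.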

Additionally, you could look into using ProcessPoolExecutor.map() to clean up the code a bit. With it, you wouldn't have to define a separate variable for each result.
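
As a rough sketch of that suggestion: `Executor.map()` accepts one iterable per positional parameter of the function and yields plain results in input order, so the three calls collapse into one. Here `fake_scan` is a hypothetical stand-in for `scanning`, and the `executor_cls` parameter is an assumption added so the sketch can also be run with a thread pool.

```python
from concurrent.futures import ProcessPoolExecutor

# The three screen regions from the question, as (x1, y1, x2, y2) boxes.
REGIONS = [(730, 740, 1335, 790), (730, 453, 1335, 500), (817, 45, 1522, 99)]


def fake_scan(x1, y1, x2, y2):
    # Hypothetical stand-in for scanning(): returns a list, like the real function.
    return [str(x2 - x1), str(y2 - y1)]


def get_data(executor_cls=ProcessPoolExecutor):
    # zip(*REGIONS) regroups the boxes into four columns: all x1, all y1, all x2, all y2.
    columns = zip(*REGIONS)
    with executor_cls() as executor:
        # map() yields return values (not Future objects) in the order of the inputs.
        return tuple(executor.map(fake_scan, *columns))
```

The results are collected inside the `with` block, so the function returns a tuple of lists just like the sequential version.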
