Python Multiprocessing with tqdm: Progress Bar Not Updating
Question
I'm using Python's multiprocessing module to speed up the computation of a feature from 3D LIDAR data. Each process computes the feature for a subset of the points in the data. I'm using tqdm to provide a progress bar for the computation, but the bar isn't updating as expected.
Here's the code I'm using:
import numpy as np
from multiprocessing import Pool
from tqdm import tqdm
from functools import partial

# np.random.seed(0)
lidar_data = np.random.uniform(low=0.0, high=100.0, size=(1000, 3))

def compute_feature(i, lidar_data, radius):
    x_center, y_center, z_center = lidar_data[i]
    height = z_center
    bounding_box_x_min = x_center - radius - 5
    bounding_box_x_max = x_center + radius + 5
    bounding_box_y_min = y_center - radius - 5
    bounding_box_y_max = y_center + radius + 5
    points_in_cylinder = []
    z_height = []
    for point in lidar_data:
        x, y, z = point
        if bounding_box_x_min <= x <= bounding_box_x_max and bounding_box_y_min <= y <= bounding_box_y_max:
            if z <= z_center:
                dist_to_center = np.sqrt((x - x_center)**2 + (y - y_center)**2)
                if dist_to_center <= radius and z_center - height <= z:
                    points_in_cylinder.append(point)
                    z_height.append(z)
    points_in_cylinder = np.array(points_in_cylinder)
    z = [round(float(point[2]), 3) for point in points_in_cylinder]
    minimum_z = min(z) if points_in_cylinder.size > 0 else z_center
    feature = z_center - 2 * minimum_z
    return feature

# lidar_data = np.random.uniform(low=0.0, high=100.0, size=(1000, 3))
radius = 3.0

with Pool() as pool:
    func = partial(compute_feature, lidar_data=lidar_data, radius=radius)
    features = list(tqdm(pool.imap(func, range(lidar_data.shape[0])), total=lidar_data.shape[0], desc="Computing feature"))

features = np.array(features)
When I run this code, the tqdm progress bar appears, but it doesn't update. It's as if the computation isn't starting at all. What could be the cause of this issue? How can I get the progress bar to update correctly?
Answer 1
Score: 0
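I just tried to reproduce your code, and the problem is not with tqdm; it's the way you're using multiprocessing.Pool().
When I ran your code verbatim, I got an endless stream of errors about an attempt to start a new process before the current one had finished initializing. That happens because the pool is created at module level, so it is launched whenever the file is loaded, instead of inside a main() function. On platforms where each worker process starts by importing the script again (the spawn start method, which is the default on Windows and macOS), every worker re-runs that module-level code and tries to create a pool of its own.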
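Whether the script is imported again in each worker depends on the multiprocessing start method, which is probably why the same file can run cleanly on one machine and fail on another. A minimal sketch for checking this; `needs_main_guard` is a hypothetical helper name used only for illustration and is not part of the multiprocessing API:

```python
import multiprocessing as mp


def needs_main_guard(start_method: str) -> bool:
    """Hypothetical helper: True if the script is imported again for workers.

    With "spawn" and "forkserver", the main module is imported again in a
    new interpreter, so unguarded module-level code (such as creating a
    Pool) runs a second time. With "fork", the worker is a copy of the
    parent process and the script is not imported again.
    """
    return start_method in ("spawn", "forkserver")


# get_start_method() reports the start method this interpreter will use.
method = mp.get_start_method()
print(f"start method: {method}, main guard required: {needs_main_guard(method)}")
```

The guard is harmless under fork, so keeping it makes the script portable across all three start methods.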
This code works:
import numpy as np
from multiprocessing import Pool
from tqdm import tqdm
from functools import partial

def compute_feature(i, lidar_data, radius):
    x_center, y_center, z_center = lidar_data[i]
    height = z_center
    # Coarse bounding box around the point, used as a cheap pre-filter.
    bounding_box_x_min = x_center - radius - 5
    bounding_box_x_max = x_center + radius + 5
    bounding_box_y_min = y_center - radius - 5
    bounding_box_y_max = y_center + radius + 5
    points_in_cylinder = []
    z_height = []
    for point in lidar_data:
        x, y, z = point
        if bounding_box_x_min <= x <= bounding_box_x_max and bounding_box_y_min <= y <= bounding_box_y_max:
            if z <= z_center:
                # Horizontal distance from the cylinder axis.
                dist_to_center = np.sqrt((x - x_center)**2 + (y - y_center)**2)
                if dist_to_center <= radius and z_center - height <= z:
                    points_in_cylinder.append(point)
                    z_height.append(z)
    points_in_cylinder = np.array(points_in_cylinder)
    z = [round(float(point[2]), 3) for point in points_in_cylinder]
    minimum_z = min(z) if points_in_cylinder.size > 0 else z_center
    feature = z_center - 2 * minimum_z
    return feature

def main():
    # np.random.seed(0)
    lidar_data = np.random.uniform(low=0.0, high=100.0, size=(1000, 3))
    radius = 3.0

    with Pool() as pool:
        func = partial(compute_feature, lidar_data=lidar_data, radius=radius)
        # imap yields results one by one, so tqdm can advance as each arrives.
        features = list(tqdm(pool.imap(func, range(lidar_data.shape[0])), total=lidar_data.shape[0], desc="Computing feature"))

    features = np.array(features)
    print(features)

# Main guard: only the process that runs the script directly calls main().
if __name__ == "__main__":
    main()
The only change I had to make was to move all of your top-level ("main") code into a function and add a main guard, so that it is called only once, from the primary process, and not from the worker processes that the pool creates.
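The guard has this effect because, under the spawn start method, a worker imports the script under the module name `__mp_main__` rather than `__main__`, so the guarded block is skipped there. A small sketch for observing which process is which; `is_primary_process` is an illustrative name, not a library function, and `multiprocessing.parent_process()` assumes Python 3.8 or later:

```python
import multiprocessing as mp


def is_primary_process() -> bool:
    """Illustrative helper: True only in the process that launched the script.

    parent_process() returns None in the primary process and a process
    object inside any child started by multiprocessing.
    """
    return mp.parent_process() is None


def main() -> None:
    # Runs only when the file is executed directly, not when a worker
    # imports it.
    print(f"__name__={__name__!r}, primary={is_primary_process()}")


if __name__ == "__main__":
    main()
```

Anything that should happen once, such as generating the data, creating the pool, or printing the results, belongs inside `main()`; function definitions can stay at module level because workers need to import them.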