Python subprocess (using Popen) hangs indefinitely and doesn't terminate when the script ends. Why, and how to fix?


Question

When I run the parent script without launching the subprocess, it terminates fine at the end. When I run the subprocess script by itself, it also terminates fine at the end.

However, when I run the subprocess script from the parent script as a subprocess, which is what I need, it hangs indefinitely instead of terminating at the end. Any ideas why this happens?

Here's the parent script code, with the subprocess launched in the move_current_image function:

import os
from PIL import Image, ImageTk
import tkinter as tk
from tkinter import messagebox
import subprocess

image_folder = r'C:\Users\anton\Pictures\image_editing\CURRENT IMAGE FOLDER' # the folder path of the images to curate
requires_editing_folder = r'C:\Users\anton\Pictures\image_editing\REQUIRES IMAGE EDITING'

files = os.listdir(image_folder)
files_index = 0

root = tk.Tk() # creates a simple GUI window
root.geometry('800x600')

canvas = tk.Canvas(root, width=800, height=600)
canvas.pack(side="left", fill="both", expand=True)

scrollbar_v = tk.Scrollbar(root, orient="vertical", command=canvas.yview)
scrollbar_v.pack(side="right", fill="y")
scrollbar_h = tk.Scrollbar(root, orient="horizontal", command=canvas.xview)
scrollbar_h.pack(side="bottom", fill="x")

canvas.configure(yscrollcommand=scrollbar_v.set, xscrollcommand=scrollbar_h.set)

frame = tk.Frame(canvas)
canvas.create_window((0,0), window=frame, anchor='nw')

label = tk.Label(frame)
label.pack()

zoom_state = False
img = None

# Functions for mousewheel scrolling
def scroll_v(event):
    canvas.yview_scroll(int(-1*(event.delta)), "units")
canvas.bind_all("<MouseWheel>", scroll_v)

def scroll_h(event):
    canvas.xview_scroll(int(-1*(event.delta)), "units")
canvas.bind_all("<Shift-MouseWheel>", scroll_h)

# Functions for keyboard scrolling
def scroll_up(event):
    canvas.yview_scroll(-5, "units")

def scroll_down(event):
    canvas.yview_scroll(5, "units")

def scroll_left(event):
    canvas.xview_scroll(-5, "units")

def scroll_right(event):
    canvas.xview_scroll(5, "units")

def load_next_image(): 
    global img
    global files_index
    if files_index >= len(files):
        messagebox.showinfo("Information", "No more images left")
        root.quit()
    else:
        img_path = os.path.join(image_folder, files[files_index])
        img = Image.open(img_path)
        update_image_display()
        files_index += 1

def delete_current_image(): # deletes the current image and loads the next one
    global files_index
    img_path = os.path.join(image_folder, files[files_index - 1])
    os.remove(img_path)
    files.pop(files_index - 1)
    files_index -= 1
    load_next_image()

def update_image_display(): # toggles zoom in, zoom out
    global img
    if zoom_state:
        tmp = img.resize((img.width, img.height)) # 100% size
    else:
        tmp = img.resize((int(img.width*0.4), int(img.height*0.4))) # 40% size
    photo = ImageTk.PhotoImage(tmp)
    label.config(image=photo)
    label.image = photo

    # Updating scroll region after image is loaded
    root.update()
    canvas.configure(scrollregion=canvas.bbox("all"))

def toggle_zoom(event): # toggles zoom state pt 2
    global zoom_state
    zoom_state = not zoom_state
    update_image_display()

def move_current_image():
    global files_index
    img_path = os.path.join(image_folder, files[files_index - 1])
    new_path = os.path.join(requires_editing_folder, files[files_index - 1])
    os.rename(img_path, new_path)

    subprocess.Popen(["python", "pixelbin-image-transformation-v3.py", files[files_index - 1], requires_editing_folder], 
                     start_new_session=True)

    files.pop(files_index - 1)
    files_index -= 1
    load_next_image()

# Bind the keys to the desired events
root.bind('<Right>', lambda event: load_next_image()) # approves this image, moves to next
root.bind('<Left>', lambda event: delete_current_image()) # deletes this image, as it doesn't meet the criteria
root.bind('<Up>', lambda event: move_current_image()) # moves this image into "requires photoshop editing" holding bay folder
root.bind('e', toggle_zoom) # bounces back between 100% and 40% zoom

# keyboard-based scrolling, for maximum efficiency
root.bind('f', scroll_down)
root.bind('r', scroll_up)
root.bind('d', scroll_left)
root.bind('g', scroll_right)

load_next_image() # start the workflow
root.mainloop() # start the GUI main loop

Here's the subprocess script code:

import sys
import os

# STEP 1 = UPLOAD IMAGE FILE TO PIXELBIN
# DOCUMENTATION FOR API-BASED FILE UPLOADING: https://github.com/pixelbin-dev/pixelbin-python-sdk/blob/main/documentation/platform/ASSETS.md#fileupload

CurrentImageFilename = sys.argv[1] # this will be files[files_index - 1] from the parent script, which gets passed into this script as a variable
CurrentImageFolder = sys.argv[2] # this will be requires_editing_folder from the parent script (the image has already been moved there by the time this script is launched)
CurrentFilepathFull = os.path.join(CurrentImageFolder, CurrentImageFilename)
FinalImageDownloadFolder = r"C:\Users\anton\Pictures\image_editing\IMAGE EDITING COMPLETED"

import asyncio
from pixelbin import PixelbinClient, PixelbinConfig

config = PixelbinConfig({
    "domain": "https://api.pixelbin.io",
    "apiSecret": "my_api_key_goes_here",
})

pixelbin:PixelbinClient = PixelbinClient(config=config)

try:
    print("Pixelbin image upload in progress...")
    result = pixelbin.assets.fileUpload( # sync method call (there's also an async method, in the above documentation)
        file=open(CurrentFilepathFull, "rb"),
        path="IMAGE_EDITING_FOLDER",
        name=CurrentImageFilename, # uses the unique filename each time since I may be running transform operations in parallel. doing so on identical filenames, before deleting the previous, would absolutely cause timing problems.
        access="public-read",
        tags=["tag1","tag2"],
        metadata={},
        overwrite=True,
        filenameOverride=True)
    print("Image uploaded successfully!")
    # print(result)
except Exception as e:
    print(e)

# Pixelbin automatically modifies filenames like so. I therefore follow their convention so I can later delete the correctly-formatted filename.
FilenameModifiedByPixelbin = CurrentImageFilename.replace('.', '_') 

# STEP 2 = RUN TRANSFORMATION ON SPECIFIED IMAGE
# DOCUMENTATION FOR URL-BASED TRANSFORMATIONS: https://www.pixelbin.io/docs/url-structure/#transforming-images-using-url, https://www.pixelbin.io/docs/transformations/ml/watermark-remover/

# this basically auto-runs the transformation if you submit the properly formatted URL;
# instead of literally navigating to the URL, this executes the same functionality via background HTTP GET requests
# in this case, the cloud-name is functionally like the API key that allows it to be authenticated

def auto_delete_original_image_from_storage(): 
    # STEP 3 = AUTO-DELETE ORIGINAL IMAGE FROM PIXELBIN STORAGE, POST-TRANSFORMATION + DOWNLOAD
    # DOCUMENTATION FOR API-BASED FILE DELETION: https://github.com/pixelbin-dev/pixelbin-python-sdk/blob/main/documentation/platform/ASSETS.md#deletefile
    
    print("Pixelbin image deletion from storage in progress...")
    
    try: # I don't redefine the config stuff / permissions here, because that was done earlier during fileUpload
        result = pixelbin.assets.deleteFile( # sync method call
            fileId=f"IMAGE_EDITING_FOLDER/{FilenameModifiedByPixelbin}")
        print("Image deleted from Pixelbin storage!")
        # print(result)
    except Exception as e:
        print(e)

import requests
import time

def fetch_image(url):
    print("Pixelbin image transformation in progress...")
    response = requests.get(url)

    # continues functionally "refreshing" the CDN URL until the transformation is complete and the image is ready for download
    # note that this does NOT continue to fire additional identical jobs + use up credits each time
    # that's because once the exact specific job is sent via the specific "send job" URL? it simply runs until completed
    # and you'll get a 202 code that simply says: "that job is currently in progress, please wait..."
    # by "rerunning" it after fully completed too? it'll just auto-download the already-completed image, not re-run the original job
    while response.status_code == 202: # status code 202 = transformation still in progress;
        print("Transformation still processing. Waiting for 5 seconds before trying again.")
        time.sleep(5)
        response = requests.get(url)

    if 'TransformationJobError' in response.text:
        print("There was an error with the transformation.")
        return None

    if response.status_code == 200: # status code 200 = successful image transformation, ready for download;
        filepath = os.path.join(FinalImageDownloadFolder, CurrentImageFilename)
        with open(filepath, 'wb') as f:
            f.write(response.content)
        print("Image downloaded successfully!")
        auto_delete_original_image_from_storage()
        return True

    print(f"Unexpected status code: {response.status_code}")
    return None

url = f"https://cdn.pixelbin.io/v2/my_api_key_goes_here/wm.remove(rem_text:true)/IMAGE_EDITING_FOLDER/{FilenameModifiedByPixelbin}"
fetch_image(url)

Any ideas what could be causing this? Thanks!
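
One thing that may be worth ruling out on the child side: the `while response.status_code == 202` loop in fetch_image has no upper bound, and `requests.get` is called without a `timeout`, in which case requests does not time out on its own. Below is a minimal sketch of a bounded version of that loop. The names `poll_until_ready`, `max_attempts`, and `delay` are hypothetical, and the `fetch` callable stands in for something like `lambda: requests.get(url, timeout=30)`; a stub is used here so the sketch runs without network access.

```python
import time
from types import SimpleNamespace


def poll_until_ready(fetch, max_attempts=60, delay=5.0):
    """Call fetch() until it stops returning HTTP 202, or give up.

    fetch is any zero-argument callable returning an object with a
    status_code attribute (hypothetical helper, not in the original script).
    """
    for attempt in range(1, max_attempts + 1):
        response = fetch()
        if response.status_code != 202:
            return response  # no longer "in progress"; the caller inspects it
        if attempt < max_attempts:
            time.sleep(delay)
    raise TimeoutError(f"still processing after {max_attempts} attempts")


# Stub standing in for requests.get(url, timeout=30): two 202s, then a 200.
_codes = iter([202, 202, 200])


def fake_fetch():
    return SimpleNamespace(status_code=next(_codes))


final = poll_until_ready(fake_fetch, max_attempts=5, delay=0.01)
print(final.status_code)
```

With a cap like this, a job that never leaves the 202 state ends in a `TimeoutError` instead of keeping the process alive indefinitely.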

Answer 1

Score: 1

I also found this Reddit thread on a similar issue:

https://www.reddit.com/r/Python/comments/1vbie0/subprocesspipe_will_hang_indefinitely_if_stdout/

"Subprocess.PIPE will hang indefinitely if stdout is more than 65000 characters."

"The problem is in the communicate function. It silently fails when the subprocess passes too much data through either the output or error pipes. When communicate stops reading from the parent end of the pipe, the pipe fills and blocks writes at the child end."
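
For reference, the deadlock described in the thread requires a pipe. The Python documentation warns about it for `Popen.wait()` combined with `stdout=PIPE` or `stderr=PIPE`, and recommends `Popen.communicate()`, which keeps reading until the child closes its end. A minimal sketch of that pattern follows; the inline child program and the 200,000-character payload are made up for illustration.

```python
import subprocess
import sys

# Hypothetical child: writes well over 65,000 characters to stdout.
child_code = "import sys; sys.stdout.write('x' * 200_000)"

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

# communicate() drains both pipes while waiting, so the child never
# blocks on a full pipe buffer. The timeout is only a safety net.
out, err = proc.communicate(timeout=60)

print(len(out), proc.returncode)
```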

Could this pipe issue be what's happening here? I don't explicitly pass data back and forth between the subprocess and the parent script. However, the subprocess does download an image file, which is quite large in this case. Perhaps that similarly exceeds some kind of memory limit and causes it to hang indefinitely, as in the example above?
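
Note that the `Popen` call in move_current_image passes no `stdout=` or `stderr=` argument, so no pipe is created there: the child inherits the parent's standard handles, and the downloaded image is written to a file rather than sent through a pipe. If the child's output is still a suspect, one option is to send it to a log file and keep the `Popen` object so the parent can check whether the child has exited. The sketch below uses hypothetical names (`launch_logged`, `child.log`) and an inline child in place of pixelbin-image-transformation-v3.py.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def launch_logged(args, log_path):
    """Start a child whose stdout/stderr go to log_path; return the Popen."""
    log_file = open(log_path, "wb")
    try:
        return subprocess.Popen(
            args,
            stdin=subprocess.DEVNULL,  # child cannot block waiting for input
            stdout=log_file,
            stderr=subprocess.STDOUT,  # merge errors into the same log
        )
    finally:
        log_file.close()  # the child keeps its own copy of the handle


log_path = Path(tempfile.mkdtemp()) / "child.log"
# Inline stand-in for the real child script.
proc = launch_logged([sys.executable, "-c", "print('done')"], log_path)

# poll() returns None while the child is running; wait() blocks up to the timeout.
returncode = proc.wait(timeout=60)
log_text = log_path.read_text()
print(returncode, log_text.strip())
```

In the Tk parent, `proc.poll()` could be called periodically via `root.after(...)` instead of blocking in `wait()`, and the log file would show how far the child got before it stalled.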

Posted on June 29, 2023 at 02:25:14. Original link: https://go.coder-hub.com/76575794.html