Analyzing execution of a Python program from another Python program
Question
I want to write a Python program that analyzes the execution of other arbitrary Python programs.
For example, suppose I have a Python script called main.py that calls a function func a certain number of times. I want to create another script called analyzer.py that can "look inside" main.py while it's running and record how many times func was called. I also want to record the list of input arguments passed to func, and the return value of func each time it was called.
I cannot modify the source code of main.py or func in any way. Ideally analyzer.py would work for any Python program, and for any function.
The best way I have found to accomplish this is to have analyzer.py run main.py as a subprocess using pdb.
import subprocess
script = "main.py"
process = subprocess.Popen(['python', '-m', 'pdb', script], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
I can then send pdb commands to the program via the process' stdin and then read the output via stdout.
To retrieve the input parameters and return values of func, I need to:
- Find the line number of the first line of func by analyzing its file
- Send a breakpoint command for this file/lineno
- Send continue command
- Import pickle, serialize locals(), and print to stdout (to get input parameters)
- Send return command (go to end of function)
- Serialize __return__ and print to stdout
- Send continue command
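The first step in the list above (finding where to put the breakpoint) could be sketched with the standard ast module. This is only an illustration: the helper name first_body_line is hypothetical, it matches functions purely by name, and it assumes the breakpoint should go on the first statement of the body rather than on the def line itself.

```python
import ast


def first_body_line(source, func_name):
    """Return the line number of the first statement in func_name's body.

    Hypothetical helper: returns None if no function with that name exists.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        is_function = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_function and node.name == func_name:
            body = node.body
            first = body[0]
            # Skip a leading docstring, on the assumption that a
            # breakpoint placed on it would not be hit.
            if (len(body) > 1
                    and isinstance(first, ast.Expr)
                    and isinstance(first.value, ast.Constant)
                    and isinstance(first.value.value, str)):
                first = body[1]
            return first.lineno
    return None
```

The returned number could then be used in a pdb break command of the form "b main.py:LINENO" written to the subprocess's stdin.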
I'm wondering if there is a better way to accomplish this.
Answer 1
Score: 2
Instead of controlling pdb with pipes, you can just configure your own trace function using sys.settrace before doing import main. (Of course you can also do importlib.import_module("main") or runpy.run_module() or runpy.run_path().)
For instance,
import sys
def trace(frame, event, args):
    if event == "call":
        print(frame.f_code.co_name, frame.f_locals)
sys.settrace(trace)
# (this is where you'd `import main` to cede control to it)
def func(a, b, c):
    return a + b + c
func(1, 2, 3)
func("a", "b", "c")
prints out
func {'a': 1, 'b': 2, 'c': 3}
func {'a': 'a', 'b': 'b', 'c': 'c'}
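The question also asks for return values and a call count. One possible extension of the trace-function idea is sketched below. The helper name record_calls and the record layout are assumptions made for illustration. It relies on the documented behaviour that a trace function may return a local trace function, which then receives a "return" event whose arg is the value being returned. Functions are matched by name only, so any function with the target name in any module is recorded. A "return" event with arg None is also delivered when an exception propagates out of the function, so the two cases are not distinguished here.

```python
import runpy
import sys


def record_calls(script_path, target_name):
    """Run script_path and record calls to functions named target_name.

    Returns a list with one dict per call: {"args": ..., "return": ...}.
    """
    calls = []

    def global_trace(frame, event, arg):
        # Called with event "call" each time a new frame is entered.
        if event != "call" or frame.f_code.co_name != target_name:
            return None  # do not trace inside unrelated functions
        # At call time the frame's locals hold the bound arguments.
        record = {"args": dict(frame.f_locals), "return": None}
        calls.append(record)

        def local_trace(frame, event, arg):
            # For a "return" event, arg is the value being returned.
            if event == "return":
                record["return"] = arg
            return local_trace

        return local_trace

    sys.settrace(global_trace)
    try:
        # Execute the target script as if it were run directly.
        runpy.run_path(script_path, run_name="__main__")
    finally:
        sys.settrace(None)  # always remove the trace hook
    return calls
```

With this in analyzer.py, calls = record_calls("main.py", "func") would run the unmodified script, and len(calls) would give the number of calls. sys.settrace only affects the current thread, so calls made from other threads would need threading.settrace as well.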