February 17, 2023, 23:37:38

How to use cuda-gdb for Python debugging?

Question

I wrote a hello.py that simply contains

print("hi")

and then run

cuda-gdb python3 hello.py

I get:

Reading symbols from python3...
(No debugging symbols found in python3)
"/home/x/Desktop/py_projects/hello.py" is not a core dump: file format not recognized

How can I debug when my Python code calls CUDA functions?

Answer 1

Score: 1

It's possible to use cuda-gdb with Python, assuming you only need to debug the C/C++ portion. I don't know of a debugger that can jump from debugging Python to debugging CUDA C++. Here is one possible approach, a copy of what is presented here.

To debug a CUDA C/C++ library function called from Python, the following is one possibility, inspired by this article.

For this walkthrough, I will use the t383.py and t383.cu files verbatim from this answer, and I'll be using CUDA 10 and Python 2.7.5 on CentOS 7.
Compile your CUDA C/C++ library using the -G and -g switches, as you would for ordinary debugging:

$ nvcc -Xcompiler -fPIC -std=c++11 -shared -arch=sm_60 -G -g -o t383.so t383.cu -DFIX

We'll need two terminal sessions for this. I will refer to them as session 1 and session 2. In session 1, start your Python interpreter:

$ python
...
>>>

In session 2, find the process ID associated with your Python interpreter (replace USER with your actual username):

$ ps -ef |grep USER
...
USER    23221 22694  0 23:55 pts/0    00:00:00 python
...
$

In the above example, 23221 is the process ID of the Python interpreter (use man ps for help).
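
As an alternative to scanning the ps output, the interpreter can report its own process ID. The following is a minimal sketch to run in the session 1 interpreter; it relies only on the standard library function os.getpid(), and the variable name pid is just for illustration.

```python
import os

# Run this in the session 1 interpreter. The printed number is the
# process ID to pass to `cuda-gdb -p` in session 2.
pid = os.getpid()
print(pid)
```

The printed value should match the PID column that ps shows for the python process.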

In session 2, start cuda-gdb so as to attach to that process ID:

$ cuda-gdb -p 23221
... (lots of spew here)
(cuda-gdb)

In session 2, at the (cuda-gdb) prompt, set a breakpoint at the desired location in your CUDA C/C++ library. For this example, we will set a breakpoint at one of the first lines of kernel code, line 70 in the t383.cu file. If you haven't loaded the library yet (we haven't, in this walkthrough), cuda-gdb will point this out and ask whether you want to make the breakpoint pending on a future library load. Answer y. (Alternatively, before starting this cuda-gdb session, you could have run your Python script once from within the interpreter, as we do in session 1 further below. This would load the symbol table for the library and avoid the prompt.) After the breakpoint is set, issue the continue command in cuda-gdb to get the Python interpreter running again:

(cuda-gdb) break t383.cu:70
No symbol table is loaded.  Use the "file" command.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (t383.cu:70) pending.
(cuda-gdb) continue
Continuing.

In session 1, run your Python script:

>>> execfile("t383.py")
init terrain_height_map
1, 1, 1, 1, 1,
1, 1, 1, 1,
1, 1, 1, 1,

Our Python interpreter has now halted (and is unresponsive), because in session 2 we see that the breakpoint has been hit:

[New Thread 0x7fdb0ffff700 (LWP 23589)]
[New Thread 0x7fdb0f7fe700 (LWP 23590)]
[Switching focus to CUDA kernel 0, grid 1, block (0,0,0), thread (0,0,0), device 0, sm 0, warp 0, lane 0]

Thread 1 "python" hit Breakpoint 1, update_water_flow<<<(1,1,1),(1024,1,1)>>> (
    water_height_map=0x80080000800000, water_flow_map=0xfffcc000800600,
    d_updated_water_flow_map=0x7fdb00800800, SIZE_X=4, SIZE_Y=4) at t383.cu:70
70          int col = index % SIZE_X;
(cuda-gdb)

We see that the breakpoint is at line 70 of our library (kernel) code, just as expected. Ordinary C/C++ cuda-gdb debugging can proceed at this point within session 2, as long as you stay within the library function.

When you are finished debugging (you may need to remove any breakpoints you set), you can once again type continue in session 2 to return control to the Python interpreter in session 1 and let your application finish.
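
The same attach workflow can be applied to a standalone script instead of an interactive interpreter by pausing the script before it calls into the CUDA library. The sketch below is one hedged way to do that on Python 3; wait_for_debugger and its read_line parameter are hypothetical names introduced here and are not part of the original walkthrough or of cuda-gdb.

```python
import os


def wait_for_debugger(read_line=input):
    """Print this process's PID, then block until the user presses Enter.

    Hypothetical helper: call it before the first call into the CUDA
    library so there is time to attach cuda-gdb and set breakpoints.
    """
    pid = os.getpid()
    print(f"PID {pid}: in another terminal run `cuda-gdb -p {pid}`, "
          "set breakpoints, type `continue`, then press Enter here.")
    read_line()  # blocks until Enter is pressed (input() by default)
    return pid


# Typical placement (left as a comment so importing this file never blocks):
# wait_for_debugger()
# ...load the shared library and launch the kernel...
```

Note that execfile, used in the walkthrough above, exists only in Python 2; on Python 3 the equivalent from the interpreter prompt is exec(open("t383.py").read()).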

Answer 2

Score: 1

To complete Robert's answer: if you are using CUDA-Python, you can use the --args option to pass a command line that contains arguments. For example, this is a valid command line:

$ cuda-gdb --args python3 hello.py

Your original command is not valid because, without --args, cuda-gdb treats the argument after the program name (here, hello.py) as a host core dump file.
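
To make the argument layout explicit, here is a small sketch that assembles the command line in Python. The helper name cuda_gdb_command is hypothetical; the only point it illustrates is that everything after --args is the program to debug followed by that program's own arguments. It needs Python 3.8 or later for shlex.join.

```python
import shlex


def cuda_gdb_command(script, *script_args, python="python3"):
    # Hypothetical helper: everything after --args is the program to
    # debug (the Python interpreter) followed by its own arguments
    # (the script and whatever the script takes).
    return ["cuda-gdb", "--args", python, script, *script_args]


print(shlex.join(cuda_gdb_command("hello.py")))
# prints: cuda-gdb --args python3 hello.py
```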

Here is the complete command line with an example from the CUDA-Python repository:

$ cuda-gdb -q --args python3 simpleCubemapTexture_test.py
Reading symbols from python3...
(No debugging symbols found in python3)
(cuda-gdb) set cuda break_on_launch application
(cuda-gdb) run
Starting program: /usr/bin/python3 simpleCubemapTexture_test.py.
...
[Switching focus to CUDA kernel 0, grid 1, block (0,0,0), thread (0,0,0), device 0, sm 0, warp 0, lane 0]
0x00007fff67858600 in transformKernel<<<(8,8,1),(8,8,1)>>> ()
(cuda-gdb) p $pc
$1 = (void (*)()) 0x7fff67858600 <transformKernel>
(cuda-gdb) bt
#0  0x00007fff67858600 in transformKernel<<<(8,8,1),(8,8,1)>>> ()
