eBPF kprobe argument not matching the syscall


Question

I'm learning eBPF and experimenting with it while following the docs, but there's something I can't get to work and I don't understand why.

I have this very simple program that exits with status 5.

    #include <stdlib.h>

    int main() {
        exit(5);
        return 0;
    }
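
For quick experiments without compiling anything, a process that exits with status 5 can also be spawned from Python. This is only a convenience sketch: it assumes a Python child process is an acceptable stand-in for the C program, and the variable names are illustrative.

```python
import subprocess
import sys

# Spawn a child interpreter that terminates with exit status 5.
# Assumption: a Python child is used purely for convenience; the
# question's actual test program is the C snippet shown earlier.
child = subprocess.run([sys.executable, "-c", "import sys; sys.exit(5)"])

# The parent sees the status that the child passed to exit.
exit_status = child.returncode
print(exit_status)
```

Running the child while the tracer is attached should give the probe an exit with a known status to report.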

The exit function in the code above calls the exit_group syscall, as strace shows. Yet in my Python code, which uses eBPF through bcc, the value printed by bpf_trace_printk is 208682672 rather than 5, the value I expected because that is what exit_group is called with.

    from bcc import BPF

    def main():
        bpftext = """
    #include <uapi/linux/ptrace.h>

    void my_exit(struct pt_regs *ctx, int status){
        bpf_trace_printk("%d", status);
    }
    """
        bpf = BPF(text=bpftext)
        fname = bpf.get_syscall_fnname('exit_group')
        bpf.attach_kprobe(event=fname, fn_name='my_exit')
        while True:
            print(bpf.trace_fields())

    if __name__ == '__main__':
        main()

I've looked through everything I could find online but haven't found a solution, and I've been investigating this problem for a few days now.

I'd truly appreciate any help. Thank you!


Answer 1

Score: 1

I am not sure whether your probe function should take an extra argument after ctx. In any case, the struct pt_regs *ctx you already have should hold all the information you need. You should be able to read any register value through the dedicated macros (PT_REGS_xxx) or by accessing the structure fields manually.

The first syscall argument can be extracted with PT_REGS_PARM1:

    bpftext = """
    #include <uapi/linux/ptrace.h>
    void my_exit(struct pt_regs *ctx){
        bpf_trace_printk("%ld\\n", PT_REGS_PARM1(ctx));
    }
    """

Answer 2

Score: 1

Fix

You need to rename your function from my_exit to syscall__exit_group.

Why does this matter? BPF programs named in this way get special handling from BCC. Here's what the documentation says:

> #### 8. system call tracepoints
>
> Syntax: syscall__SYSCALLNAME
>
> syscall__ is a special prefix that creates a kprobe for the
> system call name provided as the remainder. You can use it by
> declaring a normal C function, then using the Python
> BPF.get_syscall_fnname(SYSCALLNAME) and
> BPF.attach_kprobe() to associate it.
>
> Arguments are specified on the function declaration:
> syscall__SYSCALLNAME(struct pt_regs *ctx, [, argument1 ...]).
>
> For example:
>
>     int syscall__execve(struct pt_regs *ctx,
>         const char __user *filename,
>         const char __user *const __user *__argv,
>         const char __user *const __user *__envp)
>     {
>         [...]
>     }
>
> This instruments the execve system call.

Source.

Corrected Code

    from bcc import BPF

    def main():
        bpftext = """
    #include <uapi/linux/ptrace.h>

    void syscall__exit_group(struct pt_regs *ctx, int status){
        bpf_trace_printk("%d", status);
    }
    """
        bpf = BPF(text=bpftext)
        fname = bpf.get_syscall_fnname('exit_group')
        bpf.attach_kprobe(event=fname, fn_name='syscall__exit_group')
        while True:
            print(bpf.trace_fields())

    if __name__ == '__main__':
        main()

Output from the sample program exiting:

    (b'<...>', 14896, 0, b'd...1', 3996.079261, b'5')
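
The tuple printed by bpf.trace_fields() is easier to work with once its fields are named. The sketch below assumes the field order (task, pid, cpu, flags, timestamp, message) and reuses the sample output as hard-coded data, so it runs without bcc; TraceEvent is a name made up for this illustration.

```python
from typing import NamedTuple


class TraceEvent(NamedTuple):
    """Illustrative wrapper for one trace_fields() result (not part of bcc)."""
    task: bytes   # command name as reported by the trace pipe
    pid: int      # process ID
    cpu: int      # CPU the event was recorded on
    flags: bytes  # trace flags
    ts: float     # timestamp in seconds
    msg: bytes    # text produced by bpf_trace_printk


# Sample tuple copied from the output shown in this answer.
sample = (b'<...>', 14896, 0, b'd...1', 3996.079261, b'5')

event = TraceEvent(*sample)
# The message arrives as bytes, so decode it before converting to int.
status = int(event.msg.decode())
print(f"pid {event.pid} called exit_group({status})")
```

In the tracing loop, the same unpacking could be applied to each tuple returned by bpf.trace_fields().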

How it Works

BCC rewrites your BPF program before compiling it, and the syscall__ prefix changes how the rewritten code interprets the arguments passed to the probe. To see how your code is transformed, pass the preprocessor debug flag: bpf = BPF(text=bpftext, debug=bcc.DEBUG_PREPROCESSOR) (this also requires import bcc).

Here's what happens without the syscall__ prefix:

    void my_exit(struct pt_regs *ctx){
        int status = ctx->di;
        ({ char _fmt[] = "%d"; bpf_trace_printk_(_fmt, sizeof(_fmt), status); });
    }

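This reads the RDI register and interprets it directly as the syscall argument. On a kernel with syscall wrappers, RDI holds a pointer rather than the exit status, which is consistent with the large value seen in the question.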

On the other hand, here's what happens if it's named syscall__exit_group:

    void syscall__exit_group(struct pt_regs *ctx){
    #if defined(CONFIG_ARCH_HAS_SYSCALL_WRAPPER) && !defined(__s390x__)
        struct pt_regs * __ctx = ctx->di;
        int status; bpf_probe_read(&status, sizeof(status), &__ctx->di);
    #else
        int status = ctx->di;
    #endif
        ({ char _fmt[] = "%d"; bpf_trace_printk_(_fmt, sizeof(_fmt), status); });
    }

If CONFIG_ARCH_HAS_SYSCALL_WRAPPER is defined (it is on x86_64), the RDI register is interpreted as a pointer to a second struct pt_regs, and the RDI value saved in that structure is read instead. That saved value is the first argument to exit_group().

On systems without syscall wrappers, this does the same thing as the previous example.
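
The double indirection can be imitated in user space with ctypes. This is only a rough model: PtRegs is a hypothetical one-field stand-in for the kernel's struct pt_regs, and an ordinary process address plays the role of the kernel pointer. It shows why reading di directly yields a pointer-derived number, while following the pointer yields 5.

```python
import ctypes


class PtRegs(ctypes.Structure):
    """Hypothetical, heavily simplified stand-in for struct pt_regs."""
    _fields_ = [("di", ctypes.c_uint64)]


def read_direct(ctx):
    # Plain kprobe behaviour: treat ctx.di itself as the int argument.
    # Only the low 32 bits survive the conversion to a C int.
    return ctypes.c_int32(ctx.di & 0xFFFFFFFF).value


def read_via_wrapper(ctx):
    # syscall__ behaviour on wrapper kernels: ctx.di points to another
    # register set, and the real argument is that structure's di field.
    inner = ctypes.cast(ctypes.c_void_p(ctx.di), ctypes.POINTER(PtRegs)).contents
    return ctypes.c_int32(inner.di & 0xFFFFFFFF).value


# Inner registers as saved at syscall entry: first argument is 5.
inner_regs = PtRegs(di=5)
# Outer context passed to the probe: di holds the address of inner_regs.
outer_ctx = PtRegs(di=ctypes.addressof(inner_regs))

direct = read_direct(outer_ctx)
wrapped = read_via_wrapper(outer_ctx)
print(direct, wrapped)
```

The first number printed varies from run to run because it is derived from an address, much like the 208682672 in the question; the second is the exit status.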

Posted by huangapple on May 25, 2023, 08:20:44
When reposting, please keep the link to this article: https://go.coder-hub.com/76328152.html