ebpf kprobe argument not matching the syscall

Question

I'm learning eBPF and experimenting with it while following the docs, but there is one thing I can't get to work.

I have this very simple program that exits with status 5.

#include <stdlib.h>

int main() {
   exit(5);
   return 0;
}

The exit function in the code above calls the exit_group syscall, as we can see with strace (image below). Yet in my Python code, which uses eBPF through bcc, bpf_trace_printk prints the value 208682672 instead of the 5 that exit_group is called with.

[Image: strace output for the program above]

from bcc import BPF

def main():
    bpftext = """
    #include <uapi/linux/ptrace.h>

    void my_exit(struct pt_regs *ctx, int status){
        bpf_trace_printk("%d", status);
    }
    """

    bpf = BPF(text=bpftext)
    fname = bpf.get_syscall_fnname('exit_group')
    bpf.attach_kprobe(event=fname, fn_name='my_exit')

    while True:
        print(bpf.trace_fields())


if __name__ == '__main__':
    main()

I've been investigating this for a few days and have read everything I could find online, but I couldn't find a solution.

I truly appreciate any help available and thank you!

Answer 1

Score: 1

I am not sure your probe function should take arguments beyond ctx. In any case, the struct pt_regs *ctx you already have should hold all the information you need. You should be able to read any register value through the dedicated macros (PT_REGS_xxx) or by accessing the structure fields manually.

The first syscall argument can be extracted with PT_REGS_PARM1:

    bpftext = """
    #include <uapi/linux/ptrace.h>

    void my_exit(struct pt_regs *ctx){
        bpf_trace_printk("%ld\\n", PT_REGS_PARM1(ctx));
    }
    """
Answer 2

Score: 1

Fix

You need to rename your function from my_exit to syscall__exit_group.

Why does this matter? BPF programs named in this way get special handling from BCC. Here's what the documentation says:

> #### 8. system call tracepoints
>
> Syntax: syscall__SYSCALLNAME
>
> syscall__ is a special prefix that creates a kprobe for the
> system call name provided as the remainder. You can use it by
> declaring a normal C function, then using the Python
> BPF.get_syscall_fnname(SYSCALLNAME) and
> BPF.attach_kprobe() to associate it.
>
> Arguments are specified on the function declaration:
> syscall__SYSCALLNAME(struct pt_regs *ctx, [, argument1 ...]).
>
> For example:
>
>     int syscall__execve(struct pt_regs *ctx,
>         const char __user *filename,
>         const char __user *const __user *__argv,
>         const char __user *const __user *__envp)
>     {
>         [...]
>     }
>
> This instruments the execve system call.

Source.

Corrected Code

from bcc import BPF

def main():
    bpftext = """
    #include <uapi/linux/ptrace.h>

    void syscall__exit_group(struct pt_regs *ctx, int status){
        bpf_trace_printk("%d", status);
    }
    """

    bpf = BPF(text=bpftext)
    fname = bpf.get_syscall_fnname('exit_group')
    bpf.attach_kprobe(event=fname, fn_name='syscall__exit_group')

    while True:
        print(bpf.trace_fields())


if __name__ == '__main__':
    main()

Output from the sample program exiting:

(b'<...>', 14896, 0, b'd...1', 3996.079261, b'5')
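
The tuple above can be unpacked by position. This is a minimal sketch, assuming the field order (task, pid, cpu, flags, timestamp, message) that bcc's trace_fields() returns; the helper name parse_event and the dictionary keys are made up for illustration and are not part of bcc.

```python
# Sample tuple copied from the output above.
sample = (b'<...>', 14896, 0, b'd...1', 3996.079261, b'5')


def parse_event(fields):
    """Turn one trace_fields() tuple into a dict (hypothetical helper)."""
    # Assumed order: task name, PID, CPU, flags, timestamp, message.
    task, pid, cpu, flags, ts, msg = fields
    return {
        "task": task.decode(),
        "pid": pid,
        "cpu": cpu,
        "flags": flags.decode(),
        "ts": ts,
        # The message is whatever bpf_trace_printk() formatted; here, the exit status.
        "status": int(msg.decode()),
    }


event = parse_event(sample)
print(event["pid"], event["status"])
```

The last field is the only one produced by the BPF program itself; the others come from the kernel's trace pipe.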

How it Works

The prefix changes how BCC rewrites your BPF program, and therefore how the probe's arguments are read. To see the rewritten code, use bpf = BPF(text=bpftext, debug=bcc.DEBUG_PREPROCESSOR) (this also needs import bcc).
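
As a sketch of that debugging step, the snippet below wraps the call in a function. The function name dump_rewritten_source is made up for illustration, and the bcc import is deferred so the file can be loaded on a machine without bcc. Actually calling the function is assumed to need bcc installed and root privileges.

```python
# BPF program text; a raw string so that "\n" reaches the C compiler unchanged.
BPF_TEXT = r"""
#include <uapi/linux/ptrace.h>

void syscall__exit_group(struct pt_regs *ctx, int status) {
    bpf_trace_printk("%d\n", status);
}
"""


def dump_rewritten_source(text=BPF_TEXT):
    """Compile `text` with the preprocessor debug flag set (hypothetical helper)."""
    # Imported here, not at module level, so loading this file does not require bcc.
    import bcc

    # DEBUG_PREPROCESSOR asks BCC to print the C source after its rewriting pass.
    return bcc.BPF(text=text, debug=bcc.DEBUG_PREPROCESSOR)
```

Calling dump_rewritten_source() once with the function named my_exit and once with syscall__exit_group should make the difference in argument handling visible.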

Here's what happens without the syscall__ prefix:

void my_exit(struct pt_regs *ctx){
 int status = ctx->di;
        ({ char _fmt[] = "%d"; bpf_trace_printk_(_fmt, sizeof(_fmt), status); });
    }

This reads in the RDI register and interprets it as the syscall argument.

On the other hand, here's what happens if it's named syscall__exit_group:

void syscall__exit_group(struct pt_regs *ctx){
#if defined(CONFIG_ARCH_HAS_SYSCALL_WRAPPER) && !defined(__s390x__)
 struct pt_regs * __ctx = ctx->di;
 int status; bpf_probe_read(&status, sizeof(status), &__ctx->di);
#else
 int status = ctx->di;
#endif

        ({ char _fmt[] = "%d"; bpf_trace_printk_(_fmt, sizeof(_fmt), status); });
    }

If CONFIG_ARCH_HAS_SYSCALL_WRAPPER is defined (it is on x86_64), the RDI register is treated as a pointer to a second struct pt_regs, and the RDI value saved in that inner structure is read instead. That inner value is the first argument to exit_group().

On systems without syscall wrappers, this does the same thing as the previous example.
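
The double indirection can be imitated in user space with ctypes. This is only a sketch under stated assumptions: PtRegs is a one-field stand-in for the kernel's struct pt_regs, and read_direct and read_via_wrapper are made-up names that mirror the two rewritten functions above. It also suggests why the question saw a large number: a pointer squeezed into an int. That 208682672 was the low 32 bits of a kernel pointer is an inference, not something the thread confirms.

```python
import ctypes


class PtRegs(ctypes.Structure):
    # Simplified stand-in for struct pt_regs: only the RDI slot is modelled.
    _fields_ = [("di", ctypes.c_uint64)]


def to_int32(value):
    """Keep the low 32 bits and reinterpret them as a signed C int."""
    return ctypes.c_int32(value & 0xFFFFFFFF).value


def read_direct(outer):
    """Like my_exit: treat outer->di itself as the int argument."""
    return to_int32(outer.di)


def read_via_wrapper(outer):
    """Like syscall__exit_group: follow outer->di to the inner regs, read its di."""
    inner_regs = PtRegs.from_address(outer.di)
    return to_int32(inner_regs.di)


# The syscall wrapper receives a pointer to the real register set in RDI.
inner = PtRegs(di=5)                        # real first argument: status 5
outer = PtRegs(di=ctypes.addressof(inner))  # what the kprobe sees in RDI

print(read_direct(outer))       # address-derived value, varies per run
print(read_via_wrapper(outer))  # the actual exit status
```

Only the second read recovers the status, which matches the behaviour of the syscall__ variant on kernels built with syscall wrappers.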

Posted by huangapple on 2023-05-25 08:20:44
When reposting, please keep the link to this article: https://go.coder-hub.com/76328152.html