eBPF kprobe argument not matching the syscall
Question
I'm learning eBPF and experimenting with it while following the docs, but there's something I can't get to work and I don't understand why.
I have this very simple program that exits with status 5.
#include <stdlib.h>

int main() {
    exit(5);
    return 0;
}
The exit function in the code above calls the exit_group syscall, as strace shows. Yet in my Python code, which uses eBPF through bcc, the output of my bpf_trace_printk is the value 208682672 and not the value 5 that the exit_group syscall is called with, as I was expecting.
from bcc import BPF

def main():
    bpftext = """
#include <uapi/linux/ptrace.h>

void my_exit(struct pt_regs *ctx, int status){
    bpf_trace_printk("%d", status);
}
"""
    bpf = BPF(text=bpftext)
    fname = bpf.get_syscall_fnname('exit_group')
    bpf.attach_kprobe(event=fname, fn_name='my_exit')
    while True:
        print(bpf.trace_fields())

if __name__ == '__main__':
    main()
I've searched online and have been investigating this problem for a few days now, but I couldn't find a solution.
I truly appreciate any help, and thank you!
Answer 1
Score: 1
I am not sure your probe function should take arguments beyond the context; that seems like too many. In any case, the struct pt_regs *ctx you have should already hold all the information you need. You should be able to read any register value through the dedicated macros (PT_REGS_xxx) or by accessing the structure fields manually.
The first syscall argument can be extracted with PT_REGS_PARM1:
bpftext = """
#include <uapi/linux/ptrace.h>

void my_exit(struct pt_regs *ctx){
    bpf_trace_printk("%ld\\n", PT_REGS_PARM1(ctx));
}
"""
Answer 2
Score: 1
Fix
You need to rename your function from my_exit to syscall__exit_group.
Why does this matter? BPF programs named in this way get special handling from BCC. Here's what the documentation says:
> #### 8. system call tracepoints
>
> Syntax: syscall__SYSCALLNAME
>
> syscall__ is a special prefix that creates a kprobe for the
> system call name provided as the remainder. You can use it by
> declaring a normal C function, then using the Python
> BPF.get_syscall_fnname(SYSCALLNAME) and
> BPF.attach_kprobe() to associate it.
>
> Arguments are specified on the function declaration:
> syscall__SYSCALLNAME(struct pt_regs *ctx, [, argument1 ...]).
>
> For example:
>
>     int syscall__execve(struct pt_regs *ctx,
>         const char __user *filename,
>         const char __user *const __user *__argv,
>         const char __user *const __user *__envp)
>     {
>         [...]
>     }
>
> This instruments the execve system call.
Corrected Code
from bcc import BPF

def main():
    bpftext = """
#include <uapi/linux/ptrace.h>

void syscall__exit_group(struct pt_regs *ctx, int status){
    bpf_trace_printk("%d", status);
}
"""
    bpf = BPF(text=bpftext)
    fname = bpf.get_syscall_fnname('exit_group')
    bpf.attach_kprobe(event=fname, fn_name='syscall__exit_group')
    while True:
        print(bpf.trace_fields())

if __name__ == '__main__':
    main()
Output from the sample program exiting:
(b'<...>', 14896, 0, b'd...1', 3996.079261, b'5')
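To pull the exit status out of a tuple like the one above, something along these lines should work. This is only a sketch: exit_status is a hypothetical helper, the tuple is the sample output copied from this answer rather than a live trace, and the field names follow the order bcc documents for trace_fields() (task, pid, cpu, flags, ts, msg).

```python
# Sample tuple copied from the output above; in a live script it would come
# from bpf.trace_fields() instead.
sample = (b'<...>', 14896, 0, b'd...1', 3996.079261, b'5')


def exit_status(fields):
    """Return (pid, status) from one trace_fields() tuple (hypothetical helper)."""
    task, pid, cpu, flags, ts, msg = fields
    # msg is the bytes string produced by bpf_trace_printk("%d", status).
    return pid, int(msg.decode())


print(exit_status(sample))  # (14896, 5)
```

In the loop from the corrected code, print(bpf.trace_fields()) could be swapped for print(exit_status(bpf.trace_fields())).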
How it Works
BCC rewrites your BPF program before compiling it, and the function name changes how the arguments are interpreted in that rewrite. You can use bpf = BPF(text=bpftext, debug=bcc.DEBUG_PREPROCESSOR) (after import bcc) to see how your code is transformed.
Here's what happens without the syscall__ prefix:
void my_exit(struct pt_regs *ctx){
    int status = ctx->di;
    ({ char _fmt[] = "%d"; bpf_trace_printk_(_fmt, sizeof(_fmt), status); });
}
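This reads the RDI register and interprets it directly as the syscall argument. On a kernel with syscall wrappers, RDI holds a pointer rather than the status, which is why a value like 208682672 is printed instead of 5.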
On the other hand, here's what happens if it's named syscall__exit_group:
void syscall__exit_group(struct pt_regs *ctx){
#if defined(CONFIG_ARCH_HAS_SYSCALL_WRAPPER) && !defined(__s390x__)
    struct pt_regs * __ctx = ctx->di;
    int status; bpf_probe_read(&status, sizeof(status), &__ctx->di);
#else
    int status = ctx->di;
#endif
    ({ char _fmt[] = "%d"; bpf_trace_printk_(_fmt, sizeof(_fmt), status); });
}
If CONFIG_ARCH_HAS_SYSCALL_WRAPPER is defined (it is on x86_64), the RDI register is interpreted as a pointer to a second struct pt_regs, and the RDI field of that inner structure is read. That inner field holds the first argument to exit_group().
On systems without syscall wrappers, this does the same thing as the previous example.
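The difference between the two reads can be imitated in user space with ctypes. This is a toy model, not kernel code: PtRegs, read_plain, and read_wrapped are made-up names, and only the RDI slot of struct pt_regs is modelled. It assumes a 64-bit machine where an address fits in a uint64.

```python
import ctypes


class PtRegs(ctypes.Structure):
    """Toy stand-in for struct pt_regs; only the RDI slot is modelled."""
    _fields_ = [("di", ctypes.c_uint64)]


def read_plain(ctx):
    # Plain kprobe handler: treat RDI itself as `int status`.
    # ctypes does no overflow checking, so this keeps the low 32 bits.
    return ctypes.c_int(ctx.di).value


def read_wrapped(ctx):
    # syscall__ handler on a wrapper kernel: treat RDI as a pointer to the
    # inner struct pt_regs and read the inner RDI from there.
    inner = PtRegs.from_address(ctx.di)
    return ctypes.c_int(inner.di).value


inner_regs = PtRegs(di=5)                             # the real syscall argument
outer_regs = PtRegs(di=ctypes.addressof(inner_regs))  # what the kprobe receives

print(read_plain(outer_regs))    # part of an address, varies from run to run
print(read_wrapped(outer_regs))  # 5
```

The first number plays the role of the unexpected value in the question; the second is the status that the syscall__ variant recovers.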