Attach eBPF sockops program to a specific container cgroup

Question

I want to attach an eBPF sockops program to a specific Kubernetes pod. I am calling bpf_prog_attach() as follows:

    err = bpf_prog_attach(sockops_prog_fd, cgroup_fd, BPF_CGROUP_SOCK_OPS, 0);

And here is the BPF program that I attach to the SOCKOPS hook:

    #include <linux/in.h>
    #include <linux/tcp.h>
    #include <linux/bpf.h>
    #include <sys/socket.h>
    #include <bpf/bpf_endian.h>
    #include <bpf/bpf_helpers.h>

    char LICENSE[] SEC("license") = "GPL";

    // `sock_key' is the key for the sockhash map. It is defined before the
    // map so that the map definition refers to a complete type.
    struct sock_key {
        __u32 sip4;
        __u32 dip4;
        __u32 sport;
        __u32 dport;
    } __attribute__((packed));

    // sock_ops_map maps a sock_key to a socket
    struct {
        __uint(type, BPF_MAP_TYPE_SOCKHASH);
        __uint(max_entries, 65535);
        __type(key, struct sock_key);
        __type(value, __u64);
    } sock_ops_map SEC(".maps");

    // `sk_extract_key' extracts the key from the `bpf_sock_ops' struct.
    // Per the field comments in linux/bpf.h, local_port is stored in host
    // byte order and remote_port in network byte order.
    static inline void sk_extract_key(struct bpf_sock_ops *ops,
                                      struct sock_key *key) {
        key->dip4 = ops->remote_ip4;
        key->sip4 = ops->local_ip4;
        key->sport = (bpf_htonl(ops->local_port) >> 16);
        key->dport = ops->remote_port >> 16;
    }

    SEC("sockops")
    int bpf_add_to_sockhash(struct bpf_sock_ops *skops) {
        __u32 family, op;

        family = skops->family;
        op = skops->op;
        bpf_printk("Got new operation %d for socket.\n", op);

        switch (op) {
        case BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB:
        case BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB:
            // Only IPv4 connections are added to the map
            if (family == AF_INET) {
                struct sock_key key = {};

                sk_extract_key(skops, &key);
                int ret = bpf_sock_hash_update(skops, &sock_ops_map, &key, BPF_NOEXIST);
                if (ret != 0) {
                    bpf_printk("Failed to update sockmap: %d\n", ret);
                } else {
                    bpf_printk("Added new socket to sockmap\n");
                }
            }
            break;
        default:
            break;
        }
        return 0;
    }

With the above, when I pass a cgroup_fd for the /sys/fs/cgroup/unified cgroup, everything works: the eBPF program is loaded and the print statements appear.

However, when I use the cgroup of a specific Kubernetes pod (a cgroup_fd for /sys/fs/cgroup/unified/kubepods-burstable-podad4348c2_ac53_4c09_a9dc_c207a6c68dec.slice:cri-containerd:30a47e8e847277317a29ff7bdcf5bf03391ff79b847be647120d285f62a0f7e6), the program still attaches successfully, but I get no print output.

Is there a problem with attaching to the SOCKOPS hook on a child cgroup? Or is the cgroup for a specific Kubernetes pod different from the one under unified/?
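One way to narrow this down is to check which cgroup the container's process actually belongs to, since a cgroup-attached program only sees sockets created by processes in that cgroup or its descendants. Below is a minimal sketch that reads the cgroup v2 entry (the line starting with "0::") from /proc/<pid>/cgroup. The function names are hypothetical, and it assumes you already know the host PID of the container's main process.

```c
#include <stdio.h>
#include <string.h>

/* Given one line of /proc/<pid>/cgroup, return a pointer to the cgroup v2
 * path (the text after the "0::" prefix), or NULL for v1 hierarchy lines. */
static const char *cgroup2_path_from_line(const char *line)
{
    return strncmp(line, "0::", 3) == 0 ? line + 3 : NULL;
}

/* Copy the cgroup v2 path of process `pid` into `out`.
 * Returns 0 on success, -1 if the file or the v2 entry is missing. */
static int read_cgroup2_path(int pid, char *out, size_t out_len)
{
    char file[64], line[4096];
    int rc = -1;
    FILE *f;

    snprintf(file, sizeof(file), "/proc/%d/cgroup", pid);
    f = fopen(file, "r");
    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        const char *path = cgroup2_path_from_line(line);

        if (path) {
            line[strcspn(line, "\n")] = '\0';   /* drop trailing newline */
            snprintf(out, out_len, "%s", path);
            rc = 0;
            break;
        }
    }
    fclose(f);
    return rc;
}
```

The returned path is relative to the cgroup v2 mount point (/sys/fs/cgroup/unified in the setup described above), so it can be compared directly against the directory the program was attached to.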

Answer 1

Score: 0

It seems the issue was the directory name. On my system, each Kubernetes pod had two corresponding cgroup directories. For example, the pod with UID ad4348c2-ac53-4c09-a9dc-c207a6c68dec had the following two:

    $ ls | grep ad4348c2_ac53_4c09_a9dc_c207a6c68dec
    kubepods-burstable-podad4348c2_ac53_4c09_a9dc_c207a6c68dec.slice:cri-containerd:30a47e8e847277317a29ff7bdcf5bf03391ff79b847be647120d285f62a0f7e6
    kubepods-burstable-podad4348c2_ac53_4c09_a9dc_c207a6c68dec.slice:cri-containerd:3a4af0e09c0e7e506fef59b92cbeb008b0a3e66d442e54e5ca5ded642841a335

The correct cgroup can be found using the following command (or by inspecting the pod's JSON output):

    $ kubectl get pods -A -o 'custom-columns=PodName:.metadata.name,PodUID:.metadata.uid,ContainerID:.status.containerStatuses[0].containerID'
    PodName                    PodUID                                 ContainerID
    frontend-b74f77687-sd8rf   ad4348c2-ac53-4c09-a9dc-c207a6c68dec   containerd://3a4af0e09c0e7e506fef59b92cbeb008b0a3e66d442e54e5ca5ded642841a335

Hence, the correct cgroup to attach to in order to inspect the pod's socket messages is the one whose suffix matches that container ID: kubepods-burstable-podad4348c2_ac53_4c09_a9dc_c207a6c68dec.slice:cri-containerd:3a4af0e09c0e7e506fef59b92cbeb008b0a3e66d442e54e5ca5ded642841a335
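The mapping from the kubectl output to the directory name can be sketched in code. This is only an illustration of the naming pattern seen on this particular system (containerd as the runtime, dashes in the pod UID replaced by underscores, the QoS class embedded in the name). The function name pod_cgroup_dir_name and its parameters are hypothetical, and other clusters may lay out their cgroups differently.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds
 *   kubepods-<qos>-pod<uid with '-' replaced by '_'>.slice:cri-containerd:<id>
 * from the pod UID and container ID reported by kubectl.
 * Returns 0 on success, -1 if an input or the output buffer is too small. */
static int pod_cgroup_dir_name(const char *qos, const char *pod_uid,
                               const char *container_id,
                               char *out, size_t out_len)
{
    static const char prefix[] = "containerd://";
    char uid[64];
    size_t i, n = strlen(pod_uid);
    int written;

    if (n >= sizeof(uid))
        return -1;
    for (i = 0; i <= n; i++)   /* i == n copies the terminating NUL too */
        uid[i] = pod_uid[i] == '-' ? '_' : pod_uid[i];

    /* kubectl reports "containerd://<id>"; the directory uses the bare id */
    if (strncmp(container_id, prefix, sizeof(prefix) - 1) == 0)
        container_id += sizeof(prefix) - 1;

    written = snprintf(out, out_len,
                       "kubepods-%s-pod%s.slice:cri-containerd:%s",
                       qos, uid, container_id);
    return (written < 0 || (size_t)written >= out_len) ? -1 : 0;
}
```

Called with "burstable", the pod UID, and the ContainerID column from the output above, this yields the second directory in the ls listing, which is the one to open and pass as cgroup_fd.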

huangapple
  • Posted on May 28, 2023, 14:22:19
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76350203.html