Re: [PATCH bpf-next v2 3/6] bpf: Introduce process open coded iterator kfuncs

From: Chuyi Zhou
Date: Fri Sep 15 2023 - 11:04:39 EST




在 2023/9/15 07:26, Andrii Nakryiko 写道:
On Tue, Sep 12, 2023 at 12:02 AM Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx> wrote:

This patch adds kfuncs bpf_iter_process_{new,next,destroy} which allow
creation and manipulation of struct bpf_iter_process in open-coded iterator
style. BPF programs can use these kfuncs or through bpf_for_each macro to
iterate all processes in the system.

Signed-off-by: Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx>
---
include/uapi/linux/bpf.h | 4 ++++
kernel/bpf/helpers.c | 3 +++
kernel/bpf/task_iter.c | 29 +++++++++++++++++++++++++++++
tools/include/uapi/linux/bpf.h | 4 ++++
tools/lib/bpf/bpf_helpers.h | 5 +++++
5 files changed, 45 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index de02c0971428..befa55b52e29 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -7322,4 +7322,8 @@ struct bpf_iter_css_task {
__u64 __opaque[1];
} __attribute__((aligned(8)));

+struct bpf_iter_process {
+ __u64 __opaque[1];
+} __attribute__((aligned(8)));
+
#endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index d6a16becfbb9..9b7d2c6f99d1 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2507,6 +2507,9 @@ BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY)
BTF_ID_FLAGS(func, bpf_iter_css_task_new, KF_ITER_NEW)
BTF_ID_FLAGS(func, bpf_iter_css_task_next, KF_ITER_NEXT | KF_RET_NULL)
BTF_ID_FLAGS(func, bpf_iter_css_task_destroy, KF_ITER_DESTROY)
+BTF_ID_FLAGS(func, bpf_iter_process_new, KF_ITER_NEW)
+BTF_ID_FLAGS(func, bpf_iter_process_next, KF_ITER_NEXT | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_iter_process_destroy, KF_ITER_DESTROY)
BTF_ID_FLAGS(func, bpf_dynptr_adjust)
BTF_ID_FLAGS(func, bpf_dynptr_is_null)
BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly)
diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
index d8539cc05ffd..9d1927dc3a06 100644
--- a/kernel/bpf/task_iter.c
+++ b/kernel/bpf/task_iter.c
@@ -851,6 +851,35 @@ __bpf_kfunc void bpf_iter_css_task_destroy(struct bpf_iter_css_task *it)
kfree(kit->css_it);
}

+struct bpf_iter_process_kern {
+ struct task_struct *tsk;
+} __attribute__((aligned(8)));
+

Few high level thoughts. I think it would be good to follow
SEC("iter/task") naming and approach. Open-coded iterators in many
ways are in-kernel counterpart to iterator programs, so keeping them
close enough within reason is useful for knowledge transfer.

SEC("iter/task") allows to:
a) iterate all threads in the system
b) iterate all threads for a given TGID
c) it also allows to "iterate" a single thread or process, but that's
a bit less relevant for in-kernel iterator, but we can still support
them, why not?

I'm not sure if it supports iterating all processes (as in group
leaders of each task group) in the system, but if it's possible I
think we should support it at least for open-coded iterator, seems
like a very useful functionality.

So to that end, let's design a small set of input arguments for
bpf_iter_process_new() that would allow to specify this as flags +
either (optional) struct task_struct * pointer to represent
task/process or PID/TGID.


Another concern from Alexei was the readability of the API of open-coded in BPF Program[1].

bpf_for_each(task, curr) is straightforward. Users can easily understand that this API does the same thing as 'for_each_process' in kernel.

However, if we keep the approach of SEC("iter/task")

enum ITER_ITEM {
ITER_TASK,
ITER_THREAD,
}

__bpf_kfunc int bpf_iter_task_new(struct bpf_iter_process *it, struct task_struct *group_task, enum ITER_ITEM type)

the API have to chang:


bpf_for_each(task, curr, NULL, ITERATE_TASK) // iterate all process in the system
bpf_for_each(task, curr, group_leader, ITERATE_THREAD) // iterate all thread of group_leader
bpf_for_each(task, curr, NULL, ITERATE_THREAD) //iterate all threads of all the process in the system

Useres may guess what are this API actually doing....

So, I'm thinking if we can add a layer of abstraction to hide the details from the users:

#define bpf_for_each_process(task) \
bpf_for_each(task, curr, NULL, ITERATE_TASK)


It would be nice if you could give me some better suggestions.

Thanks!

[1] https://lore.kernel.org/lkml/CAADnVQLbDWUxFen-RS67C86sOE5DykEPD8xyihJ2RnG1WEnTQg@xxxxxxxxxxxxxx/