Re: [PATCH bpf-next v5 0/5] bpf: Add open-coded style process file iterator and bpf_fget_task() kfunc
From: Christian Brauner
Date: Tue Dec 10 2024 - 09:44:59 EST
On Tue, Dec 10, 2024 at 02:01:53PM +0000, Juntong Deng wrote:
> This patch series adds open-coded style process file iterator
> bpf_iter_task_file and bpf_fget_task() kfunc, and corresponding
> selftests test cases.
>
> In addition, since fs kfuncs is generic and useful for scenarios
> other than LSM, this patch makes fs kfuncs available for SYSCALL
> and TRACING program types [0].
>
> [0]: https://lore.kernel.org/bpf/CAPhsuW6ud21v2xz8iSXf=CiDL+R_zpQ+p8isSTMTw=EiJQtRSw@xxxxxxxxxxxxxx/
>
> Although iter/task_file already exists, for CRIB we still need the
What is CRIB?
> open-coded iterator style process file iterator, and the same is true
> for other bpf iterators such as iter/tcp, iter/udp, etc.
>
> The traditional bpf iterator is more like a bpf version of procfs, but
> similar to procfs, it is not suitable for CRIB scenarios that need to
> obtain large amounts of complex, multi-level in-kernel information.
>
> The following is from previous discussions [2].
>
> [2]: https://lore.kernel.org/bpf/AM6PR03MB5848CA34B5B68C90F210285E99B12@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>
> This is because the context of bpf iterators is fixed and bpf iterators
> cannot be nested. This means that a bpf iterator program can only
> complete a specific small iterative dump task, and cannot dump
> multi-level data.
>
> An example, when we need to dump all the sockets of a process, we need
> to iterate over all the files (sockets) of the process, and iterate over
> the all packets in the queue of each socket, and iterate over all data
> in each packet.
>
> If we use bpf iterator, since the iterator can not be nested, we need to
> use socket iterator program to get all the basic information of all
> sockets (pass pid as filter), and then use packet iterator program to
> get the basic information of all packets of a specific socket (pass pid,
> fd as filter), and then use packet data iterator program to get all the
> data of a specific packet (pass pid, fd, packet index as filter).
>
> This would be complicated and require a lot of (each iteration)
> bpf program startup and exit (leading to poor performance).
>
> By comparison, open coded iterator is much more flexible, we can iterate
> in any context, at any time, and iteration can be nested, so we can
> achieve more flexible and more elegant dumping through open coded
> iterators.
>
> With open coded iterators, all of the above can be done in a single
> bpf program, and with nested iterators, everything becomes compact
> and simple.
>
> Also, bpf iterators transmit data to user space through seq_file,
> which involves a lot of open (bpf_iter_create), read, close syscalls,
> context switching, memory copying, and cannot achieve the performance
> of using ringbuf.
>
> Signed-off-by: Juntong Deng <juntong.deng@xxxxxxxxxxx>
> ---
> v4 -> v5:
> * Add file type checks in test cases for process file iterator
> and bpf_fget_task().
>
> * Use fentry to synchronize tests instead of waiting in a loop.
>
> * Remove path_d_path_kfunc_non_lsm test case.
>
> * Replace task_lookup_next_fdget_rcu() with fget_task_next().
>
> * Remove future merge conflict section in cover letter (resolved).
>
> v3 -> v4:
> * Make all kfuncs generic, not CRIB specific.
>
> * Move bpf_fget_task to fs/bpf_fs_kfuncs.c.
>
> * Remove bpf_iter_task_file_get_fd and bpf_get_file_ops_type.
>
> * Use struct bpf_iter_task_file_item * as the return value of
> bpf_iter_task_file_next.
>
> * Change fd to unsigned int type and add next_fd.
>
> * Add KF_RCU_PROTECTED to bpf_iter_task_file_new.
>
> * Make fs kfuncs available to SYSCALL and TRACING program types.
>
> * Update all relevant test cases.
>
> * Remove the discussion section from cover letter.
>
> v2 -> v3:
> * Move task_file open-coded iterator to kernel/bpf/helpers.c.
>
> * Fix duplicate error code 7 in test_bpf_iter_task_file().
>
> * Add comment for case when bpf_iter_task_file_get_fd() returns -1.
>
> * Add future plans in commit message of "Add struct file related
> CRIB kfuncs".
>
> * Add Discussion section to cover letter.
>
> v1 -> v2:
> * Fix a type definition error in the fd parameter of
> bpf_fget_task() at crib_common.h.
>
> Juntong Deng (5):
> bpf: Introduce task_file open-coded iterator kfuncs
> selftests/bpf: Add tests for open-coded style process file iterator
> bpf: Add bpf_fget_task() kfunc
> bpf: Make fs kfuncs available for SYSCALL and TRACING program types
> selftests/bpf: Add tests for bpf_fget_task() kfunc
>
> fs/bpf_fs_kfuncs.c | 42 ++++---
> kernel/bpf/helpers.c | 3 +
> kernel/bpf/task_iter.c | 92 ++++++++++++++
> .../testing/selftests/bpf/bpf_experimental.h | 15 +++
> .../selftests/bpf/prog_tests/fs_kfuncs.c | 46 +++++++
> .../testing/selftests/bpf/prog_tests/iters.c | 79 ++++++++++++
> .../selftests/bpf/progs/fs_kfuncs_failure.c | 33 +++++
> .../selftests/bpf/progs/iters_task_file.c | 88 ++++++++++++++
> .../bpf/progs/iters_task_file_failure.c | 114 ++++++++++++++++++
> .../selftests/bpf/progs/test_fget_task.c | 63 ++++++++++
> .../selftests/bpf/progs/verifier_vfs_reject.c | 10 --
> 11 files changed, 559 insertions(+), 26 deletions(-)
> create mode 100644 tools/testing/selftests/bpf/progs/fs_kfuncs_failure.c
> create mode 100644 tools/testing/selftests/bpf/progs/iters_task_file.c
> create mode 100644 tools/testing/selftests/bpf/progs/iters_task_file_failure.c
> create mode 100644 tools/testing/selftests/bpf/progs/test_fget_task.c
>
> --
> 2.39.5
>