Re: [RFC PATCH bpf-next 1/4] bpf: Introduce css_task open-coded iterator kfuncs

From: Alexei Starovoitov
Date: Tue Sep 05 2023 - 15:04:12 EST


On Sun, Aug 27, 2023 at 12:21 AM Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx> wrote:
>
> This Patch adds kfuncs bpf_iter_css_task_{new,next,destroy} which allow
> creation and manipulation of struct bpf_iter_css_task in open-coded
> iterator style. These kfuncs actually wrapps
> css_task_iter_{start,next,end}. BPF programs can use these kfuncs through
> bpf_for_each macro for iteration of all tasks under a css.
>
> Signed-off-by: Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx>
> ---
> include/uapi/linux/bpf.h | 4 ++++
> kernel/bpf/helpers.c | 3 +++
> kernel/bpf/task_iter.c | 39 ++++++++++++++++++++++++++++++++++
> tools/include/uapi/linux/bpf.h | 4 ++++
> tools/lib/bpf/bpf_helpers.h | 7 ++++++
> 5 files changed, 57 insertions(+)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 60a9d59beeab..2a6e9b99564b 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -7195,4 +7195,8 @@ struct bpf_iter_num {
> __u64 __opaque[1];
> } __attribute__((aligned(8)));
>
> +struct bpf_iter_css_task {
> + __u64 __opaque[1];
> +} __attribute__((aligned(8)));
> +
> #endif /* _UAPI__LINUX_BPF_H__ */
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 9e80efa59a5d..cf113ad24837 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -2455,6 +2455,9 @@ BTF_ID_FLAGS(func, bpf_dynptr_slice_rdwr, KF_RET_NULL)
> BTF_ID_FLAGS(func, bpf_iter_num_new, KF_ITER_NEW)
> BTF_ID_FLAGS(func, bpf_iter_num_next, KF_ITER_NEXT | KF_RET_NULL)
> BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY)
> +BTF_ID_FLAGS(func, bpf_iter_css_task_new, KF_ITER_NEW)
> +BTF_ID_FLAGS(func, bpf_iter_css_task_next, KF_ITER_NEXT | KF_RET_NULL)
> +BTF_ID_FLAGS(func, bpf_iter_css_task_destroy, KF_ITER_DESTROY)
> BTF_ID_FLAGS(func, bpf_dynptr_adjust)
> BTF_ID_FLAGS(func, bpf_dynptr_is_null)
> BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly)
> diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
> index c4ab9d6cdbe9..b1bdba40b684 100644
> --- a/kernel/bpf/task_iter.c
> +++ b/kernel/bpf/task_iter.c
> @@ -823,6 +823,45 @@ const struct bpf_func_proto bpf_find_vma_proto = {
> .arg5_type = ARG_ANYTHING,
> };
>
> +struct bpf_iter_css_task_kern {
> + struct css_task_iter *css_it;
> +} __attribute__((aligned(8)));
> +
> +__bpf_kfunc int bpf_iter_css_task_new(struct bpf_iter_css_task *it,
> + struct cgroup_subsys_state *css, unsigned int flags)
> +{
> + struct bpf_iter_css_task_kern *kit = (void *)it;
> +
> + BUILD_BUG_ON(sizeof(struct bpf_iter_css_task_kern) != sizeof(struct bpf_iter_css_task));
> + BUILD_BUG_ON(__alignof__(struct bpf_iter_css_task_kern) !=
> + __alignof__(struct bpf_iter_css_task));
> +
> + kit->css_it = kzalloc(sizeof(struct css_task_iter), GFP_KERNEL);
> + if (!kit->css_it)
> + return -ENOMEM;
> + css_task_iter_start(css, flags, kit->css_it);

Some of the flags are internal. Like CSS_TASK_ITER_SKIPPED.
The kfunc should probably only allow CSS_TASK_ITER_PROCS |
CSS_TASK_ITER_THREADED,
and not CSS_TASK_ITER_THREADED alone.

Since they're #define-s it's not easy for bpf prog to use them.
I think would be good to have a pre-patch that converts them to enum,
so that bpf prog can take them from vmlinux.h.


But the main issue of the patch that it adds this iter to common kfuncs.
That's not safe, since css_task_iter_*() does spin_unlock_irq() which
might screw up irq flags depending on the context where bpf prog is running.
Can css_task_iter internals switch to irqsave/irqrestore?
css_set_lock is also global, so the bpf side has to be careful in
where it allows to use this iter.
bpf_lsm hooks are safe, most of bpf iter-s are safe too.
Future bpf-oom hooks are probably safe as well.
We probably need an allowlist here.