Re: [PATCH v2 2/2] bpf/selftests: Add selftests for new task kfuncs
From: Kumar Kartikeya Dwivedi
Date: Mon Oct 03 2022 - 17:03:48 EST
On Mon, 3 Oct 2022 at 21:53, Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote:
> On 10/3/22 8:56 AM, Kumar Kartikeya Dwivedi wrote:
> >>> Also, could you include a test to make sure sleepable programs cannot
> >>> call bpf_task_acquire? It seems to assume RCU read lock is held while
> >>> that may not be true. If already not possible, maybe a WARN_ON_ONCE
> >>> inside the helper to ensure future cases don't creep in.
> >> I don't _think_ it's unsafe for a sleepable program to call
> >> bpf_task_acquire(). My understanding is that the struct task_struct *
> >> parameter to bpf_task_acquire() is not PTR_UNTRUSTED, so it's safe to
> >> dereference directly in the kfunc. The implicit assumption here is that
> >> the task was either passed to the BPF program (which is calling
> >> bpf_task_acquire()) from the main kernel in something like a trace or
> >> struct_ops callback, or it was a referenced kptr that was removed from a
> >> map with bpf_kptr_xchg(), and is now owned by the BPF program. Given
> >> that the ptr type is not PTR_UNTRUSTED, it seemed correct to assume that
> >> the task was valid in bpf_task_acquire() regardless of whether we were
> >> in an RCU read region or not, but please let me know if I'm wrong about
> > I don't think it's correct. You can just walk arbitrary structures and
> > obtain a normal PTR_TO_BTF_ID that looks seemingly ok to the verifier
> > but has no lifetime guarantees. It's a separate pre-existing problem
> > unrelated to your series . PTR_UNTRUSTED is not set for those cases
> > currently.
> > So the argument to your bpf_task_acquire may already be freed by then.
> > This issue would be exacerbated in sleepable BPF programs, where RCU
> > read lock is not held, so some cases of pointer walking where it may
> > be safe no longer holds.
> > I am planning to clean this up, but I'd still prefer if we don't allow
> > this helper in sleepable programs, yet. kptr_get is ok to allow.
> > Once you have explicit BPF RCU read sections, sleepable programs can
> > take it, do loads, and operate on the RCU pointer safely until they
> > invalidate it with the outermost bpf_rcu_read_unlock. It's needed for
> > local kptrs as well, not just this. I plan to post this very soon, so
> > we should be able to fix it up in the current cycle after landing your
> > series.
> > __rcu tags in the kernel will automatically be understood by the
> > verifier and for the majority of the correctly annotated cases this
> > will work fine and not break tracing programs.
> > : https://lore.kernel.org/bpf/CAADnVQJxe1QT5bvcsrZQCLeZ6kei6WEESP5bDXf_5qcB2Bb6_Q@xxxxxxxxxxxxxx
> >> that. Other kfuncs I saw such as bpf_xdp_ct_lookup() assumed that the
> >> parameter passed by the BPF program (which itself was passing on a
> >> pointer given to it by the main kernel) is valid as well.
> > Yeah, but the CT API doesn't assume validity of random PTR_TO_BTF_ID,
> > it requires KF_TRUSTED_ARGS which forces them to have ref_obj_id != 0.
> Other than ref_obj_id != 0, could the PTR_TO_BTF_ID obtained through
> btf_ctx_access be marked as trusted (e.g. the ctx in the selftest here)
> and bpf_task_acquire() will require KF_TRUSTED_ARGS? would it be safe in general?
Recently eed807f62610 ("bpf: Tweak definition of KF_TRUSTED_ARGS")
relaxed things a bit, now that constraint is only for PTR_TO_BTF_ID
and it allows other types without ref_obj_id > 0.
But you're right, ctx loads should usually be trusted, this is the
current plan (also elaborated a bit in that link ), usually that is
true, an exception is free path as you noted in your reply for patch 1
(and especially fexit path where object may not even be alive
anymore). There are some details to work out, but eventually I'd want
to force KF_TRUSTED_ARGS by default for all kfuncs, and we only make
exceptions in some special cases by having some KF_UNSAFE flag or
__unsafe tag for arguments. It's becoming harder to think about all
permutations if we're too permissive by default in terms of acceptable