Re: [PATCH bpf-next 1/2] bpf: align syscall writeback behavior with caller-declared size

From: Alexei Starovoitov

Date: Fri May 29 2026 - 16:40:19 EST

On Thu May 28, 2026 at 9:44 PM PDT, Yuyang Huang wrote:
> On Fri, May 29, 2026 at 9:03 AM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
>>
>> On Mon May 25, 2026 at 12:21 AM PDT, Yuyang Huang wrote:
>
> Thanks for the reply.
>
>> Ok. This is fair. 120933984460 is indeed problematic.
>
> I'm glad to hear we agree that 120933984460 is problematic.
>
>> Not really. Your patch adds checks to what looks like "random" copy_to_user()
>> places. This thread is clear indication that it's not a "robust architectural path".
>>
>> True robust path would be to wrap every copy_to_user() into another helper
>> that takes uattr start pointer and size, does extra check, and
>> returns something like ENOSPC (and not EFAULT),
>> but that would be a significant churn.
>
> I agree that the truly robust path would be a proper copy_to_user()
> wrapper instead of doing the ad-hoc check in the current path, but
> indeed, that will be a big change.
>
>> So I prefer a minimal patch that add single check in bpf_prog_query()
>> that checks that user space supplied buffer is large enough for attr->query.
>> If not -> EFAULT. That would be better than overwritting.
>> One can argue that partial copy_to_user() in __cgroup_bpf_query()
>> should still be allowed, but I'm against partial and
>> inconsistent queries.
>> query.revision might seem benign, but if we do it for all copy_to_user()
>> we better return ENOSPC to differentiate vs EFAULT to tell user space
>> to fix itself.
>
> Okay, I understand the preference is for a minimal patch to implement
> the fix. Just to make sure I fully understand your suggestion:
>
> Are you proposing that we add a single check at the entry gate in
> `bpf_prog_query()` like this?
>
> ```c
> static int bpf_prog_query(const union bpf_attr *attr,
> union bpf_attr __user *uattr, u32 uattr_size)
> {
> if (uattr_size < offsetofend(union bpf_attr, query.revision))
> return -EFAULT;
> ...
> }
> ```
>
> If we implement this, I want to confirm if you are okay with the
> consequence for legacy cgroup queries (e.g.,
> `BPF_CGROUP_INET_INGRESS`):
>
> 1. Before 120933984460: Legacy userspace compiled years ago with a
> 40-byte `bpf_attr` layout (ending at `prog_attach_flags`) regularly
> queries cgroups passing `size = 40`, and it worked perfectly.
> 2. With this check: These unmodified legacy applications will now fail
> with `-EFAULT` on newer kernels because they pass `size = 40` which is
> less than 64. This means all user-space applicaion is expected to
> "fix" their code when along with the kernel upgrade.

I bet you fixed your android setup long ago,
so all of these "but what about backward compat" is getting annoying.
Whatever kernel patch get merged android won't even backport, so enough of it.

> Is your preference to explicitly fail these legacy `size = 40` cgroup
> queries (breaking backward compatibility for them) to avoid "partial
> and inconsistent queries"?
>
> If we want to keep these legacy queries working safely (without
> failing them and without causing memory corruption), we cannot use a
> single check in `bpf_prog_query()`. We would still be forced to plumb
> `uattr_size` to `__cgroup_bpf_query()` so we can conditionally skip
> the `revision` writeback when `size < 64`.

Replied earlier that I don't like random checks like this through the code.