Re: [PATCH bpf-next v4 0/3] bpf: switch to new usercopy helpers

From: Alexei Starovoitov
Date: Wed Oct 16 2019 - 14:31:41 EST


On Wed, Oct 16, 2019 at 01:18:07PM +0200, Christian Brauner wrote:
> Hey everyone,
>
> This is v4. If you still feel that I should leave this code alone then
> simply ignore it. I won't send another version. Relevant tests pass and
> I've verified that other failures were already present without this
> patch series applied.

I'm looking at it the following way:
- v1 was posted with zero testing. Obviously broken patches.
- v[23] were claimed to be tested, yet there were serious bugs.
That means you folks ran only the tests that I pointed out in v1.
- in v4, patch 3 now has an imbalanced copy_to_user. Previously there was:
bpf_check_uarg_tail_zero+copy_from+copy_to. Now it's copy_struct_from_user+copy_to.
It's puzzling to read that code.
Moreover, the patch removes the actual_size > PAGE_SIZE check.
It's a change in behavior that the commit log doesn't mention.
- so even v4 is not ready to be merged.
- the copy_struct_from_user api was implemented by the same people who
sent buggy patches. When you guys came up with this 'generic' api
you didn't consider bpf usage and bpf_check_uarg_tail_zero() is still necessary.
- a few places that were converted to copy_struct_from_user() still have a
size > PAGE_SIZE check. Why wasn't it part of the generic helper?
It means that the api likely will be refactored again, but looking at the way
the patches were crafted I have no confidence that it will be thoroughly tested.
- and if I accept this set the future refactoring may break bpf side silently.
- what is check_zeroed_user() actually doing? imo it's a premature
optimization with a complex implementation. Most of the time user space passes
a size that is the same as the kernel expects or smaller. User space libs
are rarely newer than the kernel. In such cases they should probe the kernel
once for new features (like libbpf does) and should not be calling kernel api
again and again to receive the same E2BIG again and again. So the fancy long read
optimization is used once in real life. Yet it's much more complex than
the simple byte loop we do in bpf_check_uarg_tail_zero.
- so no, I'm not applying this. Instead I'm taking bets on when this shiny new thing
will cause issues for other subsystems.
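
For readers following the argument, the tail-zero semantics discussed above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: check_tail_zero_sketch(), copy_struct_sketch() and SKETCH_PAGE_SIZE are hypothetical names, ordinary memory reads stand in for get_user()/copy_from_user(), and the 4096-byte cap is an assumed stand-in for PAGE_SIZE.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Assumption: stand-in for the kernel's PAGE_SIZE, which is arch-dependent. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical model of a byte-loop tail check: reject oversized input,
 * accept input that is not larger than expected, otherwise require every
 * byte past 'expected' to be zero.
 */
static int check_tail_zero_sketch(const unsigned char *buf,
                                  size_t expected, size_t actual)
{
    if (actual > SKETCH_PAGE_SIZE)   /* silly large, checked before any read */
        return -E2BIG;
    if (actual <= expected)          /* nothing beyond what we know about */
        return 0;
    for (size_t i = expected; i < actual; i++)
        if (buf[i] != 0)             /* unknown field is in use */
            return -E2BIG;
    return 0;
}

/*
 * Hypothetical model of an extensible-struct copy: a shorter source is
 * zero-extended, a longer source is accepted only if its tail is zero.
 */
static int copy_struct_sketch(void *dst, size_t ksize,
                              const void *src, size_t usize)
{
    assert(dst != NULL && src != NULL);

    int err = check_tail_zero_sketch(src, ksize, usize);
    if (err)
        return err;

    size_t n = usize < ksize ? usize : ksize;
    memset(dst, 0, ksize);           /* fields the caller did not supply */
    memcpy(dst, src, n);
    return 0;
}
```

In this sketch the size cap lives inside the tail check, so a caller that swaps the check for a different helper has to carry the cap over explicitly, which is the kind of behavior change the review asks the commit log to spell out.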