Re: [PATCH bpf v5 1/2] lib/strncpy_from_user.c: Don't overcopy bytes after NUL terminator

From: Linus Torvalds
Date: Fri Nov 13 2020 - 15:11:22 EST


On Fri, Nov 13, 2020 at 11:17 AM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> The v4 approach preserves performance. It wasn't switching to byte_at_a_time:

That v4 looks better, but still pointless.

But it might be acceptable, with that final

*out = (*out & ~mask) | (c & mask);

just replaced with

*out = c & mask;

which still writes past the end, but now it only writes zeroes.

But the only reason for that to be done is if you have exposed the
destination buffer to another thread before (and you zeroed it before
exposing it), and you want to make sure that any concurrent reader can
never see anything past the end of the string.

Again - the *only* case that could possibly matter is when you
pre-zeroed the buffer, because if you didn't, then a concurrent reader
would see random garbage *anyway*, particularly since there is no SMP
memory ordering imposed with the strncpy. So nothing but "pre-zeroed"
makes any possible sense, which means that the whole "(*out & ~mask)"
in that v4 patch is completely and utterly meaningless. There's
absolutely zero reason to try to preserve any old data.

In other words, you have two cases:

(a) no threaded and unlocked accesses to the resulting string

(b) you _do_ have concurrent threaded accesses to the string and no
locking (really? That's seriously questionable),

If you have case (a), then the only correct thing to do is to
explicitly pad afterwards. It's optimal, and doesn't make any
assumptions about implementation of strncpy_from_user().

If you really have that case (b), and you absolutely require that the
filling be done without exposing any temporary garbage, and thus the
"pad afterwards" doesn't work, then you are doing something seriously
odd.

But in that seriously odd (b) case, the _only_ possibly valid thing
you can do is to pre-zero the buffer, since strncpy_from_user()
doesn't even imply any memory ordering in its accesses, so any
concurrent reader by definition will see a randomly ordered partial
string being copied. That strikes me as completely insane, but at
least a careful reader could see a valid partial string being possibly
in the process of being built up. But again, that use-case is only
possible if the buffer is pre-zeroed, so doing that "(*out & ~mask)"
cannot be relevant or sane.

If you really do have that (b) case, then I'd accept that modified v4
patch, together with an absolutely *huge* comment both in
strncpy_from_user() and very much at the call-site, talking about that
non-locked concurrent access to the destination buffer.

Linus