Re: [PATCH v7 05/19] iov_iter: Introduce fault_in_iov_iter_writeable

From: Al Viro
Date: Sat Aug 28 2021 - 21:25:40 EST


On Fri, Aug 27, 2021 at 09:48:55PM +0000, Al Viro wrote:

> So we have 3 callers where we want all-or-nothing semantics - two in
> arch/x86/kernel/fpu/signal.c and one in btrfs. HWPOISON will be a problem
> for all 3, AFAICS...
>
> IOW, it looks like we have two different things mixed here - one that wants
> to try and fault stuff in, with callers caring only about having _something_
> faulted in (most of the users) and one that wants to make sure we *can* do
> stores or loads on each byte in the affected area.
>
> Just accessing a byte in each page really won't suffice for the second kind.
> Neither will g-u-p use, unless we teach it about HWPOISON and other fun
> beasts... Looks like we want that thing to be a separate primitive; for
> btrfs I'd probably replace fault_in_pages_writeable() with clear_user()
> as a quick fix for now...

Looks like out of these 3 we have
* x86 restoring FPU state on sigreturn: correct, if somewhat obfuscated;
HWPOISON is not an issue. We want full fault-in there (1 or 2 pages)
* x86 saving FPU state into sigframe: not really needed; we do
__clear_user() on any error anyway, and moving that into the caller, past
the pagefault_enable(), will serve just fine in place of faulting the same
range in for write.
* btrfs search_ioctl(): HWPOISON is not an issue (no #MC on stores),
but the arm64 side of things is very likely a problem with MTE; there we
can have a successful store to some bytes in a page with faults on stores
elsewhere in it. With such setups that thing will loop indefinitely.
And unlike x86 FPU handling, btrfs is arch-independent.
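
To make the btrfs case concrete, here is a userland toy model of that
retry loop. It is only a sketch, not kernel code: every toy_* name is
made up for illustration, "user memory" is a plain array with a store
permission per 16-byte granule (the MTE tag granule size), the probe
touches the first byte, one byte per following page and the last byte,
and an iteration cap stands in for "loops indefinitely".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096u
#define TOY_GRANULE   16u   /* MTE tag granule size */
#define TOY_ITEM_SIZE 64u   /* hypothetical fixed-size result item */
#define TOY_NGRANULES (2 * TOY_PAGE_SIZE / TOY_GRANULE)

/* Toy "user" mapping: two pages with a per-granule store permission. */
struct toy_umem {
	unsigned char data[2 * TOY_PAGE_SIZE];
	bool store_ok[TOY_NGRANULES];
};

static void toy_umem_init(struct toy_umem *m)
{
	memset(m->data, 0, sizeof(m->data));
	for (size_t g = 0; g < TOY_NGRANULES; g++)
		m->store_ok[g] = true;
}

/* Store one byte; returns false where a real store would fault. */
static bool toy_put_byte(struct toy_umem *m, size_t off, unsigned char v)
{
	if (off >= sizeof(m->data) || !m->store_ok[off / TOY_GRANULE])
		return false;
	m->data[off] = v;
	return true;
}

/* Shaped like copy_to_user(): returns the number of bytes NOT copied. */
static size_t toy_copy_to_user(struct toy_umem *m, size_t off,
			       const unsigned char *src, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (!toy_put_byte(m, off + i, src[i]))
			break;
	return len - i;
}

/* Page-granular probe: first byte, one byte per later page, last byte. */
static int toy_fault_in_writeable(const struct toy_umem *m, size_t off,
				  size_t len)
{
	size_t end = off + len;

	if (len == 0)
		return 0;
	if (end > sizeof(m->data))
		return -1;
	for (size_t p = off; p < end; p = (p / TOY_PAGE_SIZE + 1) * TOY_PAGE_SIZE)
		if (!m->store_ok[p / TOY_GRANULE])
			return -1;
	return m->store_ok[(end - 1) / TOY_GRANULE] ? 0 : -1;
}

/*
 * Retry loop in the style of search_ioctl(): fault in, copy whole items
 * until a store faults, then go round again.
 * Returns 0 when the buffer is full, -1 if the probe fails, and -2 if
 * max_iters passes go by without finishing (the livelock).
 */
static int toy_search_loop(struct toy_umem *m, size_t buf_size, int max_iters)
{
	static const unsigned char item[TOY_ITEM_SIZE]; /* zero-filled payload */
	size_t done = 0;

	assert(buf_size <= sizeof(m->data) && buf_size % TOY_ITEM_SIZE == 0);

	for (int iter = 0; iter < max_iters; iter++) {
		if (toy_fault_in_writeable(m, done, buf_size - done))
			return -1;
		while (done < buf_size &&
		       toy_copy_to_user(m, done, item, TOY_ITEM_SIZE) == 0)
			done += TOY_ITEM_SIZE;
		if (done == buf_size)
			return 0;
		/* A store faulted mid-item: retry from the same offset. */
	}
	return -2;
}
```

With every granule writable the loop finishes in one pass. Clear
store_ok[] for the granule covering byte 100 and a 256-byte buffer gets
stuck: the first item lands, the second faults at byte 96, the probe of
byte 64 keeps succeeding, and nothing ever advances. Only when the bad
granule sits right at the retry offset does the probe notice and bail out.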

IOW, unless I'm misreading the situation, we have one caller where "all or
nothing" semantics is correct and needed, several where fault-in is pointless,
one where the current use of fault-in is actively wrong (ppc kvm, patch from
ppc folks exists), another place where neither semantics is right (btrfs on
arm64) and a bunch where "can we access at least the first byte?" should be
fine...
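
The difference between the two semantics, as a standalone sketch rather
than a proposed API. Both sketch_* helpers are hypothetical, and
granule_ok[] is an assumed stand-in for "a store to this 16-byte chunk
would succeed"; nothing here probes real user memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_GRANULE 16u /* assumed smallest unit that can fault on its own */

/* "Can we access at least the first byte?" */
static bool sketch_probe_first(const bool *granule_ok, size_t off, size_t len)
{
	assert(granule_ok != NULL);
	return len == 0 || granule_ok[off / SKETCH_GRANULE];
}

/* "All or nothing": every granule overlapping [off, off + len) must be OK. */
static bool sketch_probe_all(const bool *granule_ok, size_t off, size_t len)
{
	assert(granule_ok != NULL);
	if (len == 0)
		return true;
	for (size_t g = off / SKETCH_GRANULE;
	     g <= (off + len - 1) / SKETCH_GRANULE; g++)
		if (!granule_ok[g])
			return false;
	return true;
}
```

Given eight granules with only the fourth one bad, sketch_probe_first()
accepts the whole 128-byte range while sketch_probe_all() rejects it;
the two agree only on subranges that avoid the bad granule.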