Re: [PATCH v8 00/17] gfs2: Fix mmap + page fault deadlocks
From: Linus Torvalds
Date: Tue Oct 26 2021 - 14:50:29 EST
On Tue, Oct 26, 2021 at 11:24 AM Catalin Marinas
<catalin.marinas@xxxxxxx> wrote:
>
> While more intrusive, I'd rather change copy_page_from_iter_atomic()
> etc. to take a pointer where to write back an error code.
I absolutely hate this model.
The thing is, going down that rat-hole, you'll find that you'll need
to add it to *all* the "copy_to/from_user()" cases, which isn't
acceptable. So then you start doing some duplicate versions with
different calling conventions, just because of things like this.
So no, I really don't want a "pass down a reference to an extra error
code" kind of horror.
That said, the fact that these sub-page faults are always
non-recoverable might be a hint to a solution to the problem: maybe we
could extend the existing return code with actual negative error
numbers.
Because in by far _most_ uses of "copy_to/from_user()" and friends,
the only thing we look for is "zero for success".
We could extend the "number of bytes _not_ copied" semantics to say
"negative means fatal", and because there are fairly few places that
actually look at non-zero values, we could have a coccinelle script
that actually marks those places.
End result: no change in calling conventions, no change to most users,
and in the (relatively few) cases where we look at the "what about
partial results" situation, we just add a
    .. existing code ..
    ret = copy_from_user(..);
+   if (ret < 0)
+           break;  // or whatever "fatal error" situation
    .. existing code ..
kind of thing that just stops the re-try.
(The coccinelle script couldn't actually make that change itself, but
it could add some comment marker or something so that the places it
finds are easy to locate and then fix up manually).
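
To make the proposed return-value convention concrete, here is a minimal userspace sketch in C. It is not kernel code: `struct sketch_src`, `sketch_copy_from_user()` and `sketch_copy_all()` are hypothetical names invented for illustration, and the choice of `-EFAULT` as the fatal value is an assumption. It only models the idea that 0 means success, a positive value is the number of bytes _not_ copied, and a negative value means "fatal, stop retrying".

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a user buffer; not a kernel structure. */
struct sketch_src {
	const char *data;   /* backing bytes */
	size_t fault_at;    /* first offset that faults */
	int fatal;          /* nonzero: the fault is non-recoverable */
	int faulted_in;     /* set once a recoverable fault was resolved */
};

/*
 * Hypothetical stand-in for copy_from_user() with the extended return:
 *   0    everything copied
 *   > 0  number of bytes NOT copied (caller may fault in and retry)
 *   < 0  fatal error, retrying cannot help (assumed here: -EFAULT)
 */
static long sketch_copy_from_user(char *dst, struct sketch_src *src,
				  size_t off, size_t len)
{
	size_t ok = len;

	if (!src->faulted_in && off + len > src->fault_at) {
		if (src->fatal)
			return -EFAULT;
		/* Recoverable: copy only the bytes before the fault. */
		ok = src->fault_at > off ? src->fault_at - off : 0;
	}
	memcpy(dst + off, src->data + off, ok);
	return (long)(len - ok);
}

/*
 * Retry loop of the "what about partial results" kind. The only new
 * line compared to a classic loop is the "ret < 0" check.
 * Returns the number of bytes copied, or a negative error.
 */
static long sketch_copy_all(char *dst, struct sketch_src *src, size_t len)
{
	size_t off = 0;

	while (off < len) {
		long ret = sketch_copy_from_user(dst, src, off, len - off);

		if (ret < 0)
			return ret;             /* fatal: stop the re-try */
		off += (len - off) - (size_t)ret;
		if (ret > 0)
			src->faulted_in = 1;    /* models faulting the page in */
	}
	return (long)off;
}
```

In this model, a source with a recoverable fault at offset 5 is copied in two passes, while a source marked fatal makes `sketch_copy_all()` return the negative value right away instead of looping. Callers that only test for "zero for success" keep working unchanged, since any non-zero value still counts as failure.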
Linus