Re: [PATCH v8 00/17] gfs2: Fix mmap + page fault deadlocks

From: Catalin Marinas
Date: Thu Oct 28 2021 - 17:40:34 EST


On Thu, Oct 28, 2021 at 10:20:52PM +0100, Catalin Marinas wrote:
> On Wed, Oct 27, 2021 at 02:14:48PM -0700, Linus Torvalds wrote:
> > On Wed, Oct 27, 2021 at 12:13 PM Catalin Marinas
> > <catalin.marinas@xxxxxxx> wrote:
> > > As an alternative, you mentioned earlier that a per-thread fault status
> > > was not feasible on x86 due to races. Was this only for the hw poison
> > > case? I think the uaccess is slightly different.
> >
> > It's not x86-specific, it's very generic.
> >
> > If we set some flag in the per-thread status, we'll need to be careful
> > about not overwriting it if we then have a subsequent NMI that _also_
> > takes a (completely unrelated) page fault - before we then read the
> > per-thread flag.
> >
> > Think 'perf' and fetching backtraces etc.
> >
> > Note that the NMI page fault can easily also be a pointer coloring
> > fault on arm64, for exactly the same reason that whatever original
> > copy_from_user() code was. So this is not a "oh, pointer coloring
> > faults are different". They have the same re-entrancy issue.
> >
> > And both the "pagefault_disable" and "fault happens in interrupt
> > context" cases are also the exact same 'faulthandler_disabled()'
> > thing. So even at fault time they look very similar.
>
> They do look fairly similar but we should have the information in the
> fault handler to distinguish: not a page fault (pte permission or p*d
> translation), in_task(), user address, fixup handler. But I agree the
> logic looks fragile.
>
> I think for nested contexts we can save the uaccess fault state on
> exception entry, restore it on return. Or (needs some thinking on
> atomicity) save it in a local variable. The high-level API would look
> something like:
>
> unsigned long uaccess_flags; /* we could use TIF_ flags */
>
> uaccess_flags = begin_retriable_uaccess();
> copied = copy_page_from_iter_atomic(...);
> retry = end_retriable_uaccess(uaccess_flags);

It doesn't work with local flags, so it would need to be saved on
exception entry (interrupt, breakpoint etc.) on the stack, restore on
return. But the API would return pretty close (and probably still more
complicated than copy_*() returning an error code).

--
Catalin