Re: [PATCH 15/31] mm/userfaultfd: allow pte_offset_map_lock() to fail

From: Peter Xu
Date: Fri May 26 2023 - 12:26:04 EST

On Thu, May 25, 2023 at 03:06:27PM -0700, Hugh Dickins wrote:
> On Wed, 24 May 2023, Peter Xu wrote:
> > On Sun, May 21, 2023 at 10:07:35PM -0700, Hugh Dickins wrote:
> > > mfill_atomic_install_pte() and mfill_atomic_pte_zeropage() treat
> > > failed pte_offset_map_lock() as -EFAULT, with no attempt to retry.
> >
> > Could you help explain why it should be -EFAULT, not -EAGAIN or -EEXIST?
> Thanks a lot for looking, Peter.
> No good justification for -EFAULT: I just grabbed the closest, fairly
> neutral, error code that I could see already being in use there: but now
> that you mention -EAGAIN, which I can see being used from mfill_atomic(),
> yes, that would be ideal - and consistent with how it's already being used.
> I'll make that change, thanks for suggesting. (And it had bugged me how
> my fs/userfaultfd.c was electing to retry, but this one electing to fail.)


> >
> > IIUC right now if pte existed we have -EEXIST returned as part of the
> > userfault ABI, no matter whether it's pte or thp.
> It might or might not correspond to -EEXIST - it might even end up as
> -EFAULT on a retry after -EAGAIN: I see mfill_atomic() contains both
> -EEXIST and -EFAULT cases for pmd_trans_huge(). Actually, I could
> say that the -EFAULT case there corresponds to the -EFAULT in this
> 15/31 patch, but that would be by coincidence not design: I'm happier
> with your -EAGAIN suggestion.

I had a feeling that that 2nd -EFAULT there could crash some userapp
already if it got returned somewhere, because the userapp shouldn't expect
that. IMHO it should also return -EAGAIN, or even -EEXIST because even if
user retries, we should highly possibly see that thp again, so the -EEXIST
should possibly follow anyway.

Not a big deal here I think - if an userapp can trigger that -EFAULT I'd
say it's also a user bug because it made two decisions already on resolving
page fault for single VA, and it's racy between them..

> >
> > IMHO it may boil down to my limited knowledge on how pte_offset_map_lock()
> > is used after this part 2 series, and I assume the core changes will be in
> > your 3rd series (besides this one and the arch one).
> >
> > Please shed some light if there's quick answers (IIUC this is for speeding
> > up collapsing shmem thps, but still no much clue here), or I can also wait
> > for reading the 3rd part if it'll come soon in any form.
> It wouldn't be particularly easy to deduce from the third series of
> patches, rather submerged in implementation details. Just keep in mind
> that, like in the "old" pmd_trans_unstable() cases, there may be instants
> at which, when trying to get the lock on a page table, that page table
> might already have gone, or been replaced by something else e.g. a THP,
> and a retry necessary at the outer level (if it's important to persist).

I'm actually still curious how the 3rd series will look like; would love to
read it when it comes.


Peter Xu