Re: [External] Re: [RFC PATCH] x86/fault: move might_sleep() out of mmap read lock

From: Muchun Song
Date: Fri Nov 26 2021 - 00:57:07 EST


On Thu, Nov 25, 2021 at 6:45 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> Muchun, Dave!
>
> On Mon, Nov 22 2021 at 14:59, Muchun Song wrote:
> > On Fri, Nov 19, 2021 at 11:04 PM Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
> >> >
> >> > + might_sleep();
> >> > +
> >> > /*
> >> > * Kernel-mode access to the user address space should only occur
> >> > * on well-defined single instructions listed in the exception
> >> > @@ -1346,13 +1348,6 @@ void do_user_addr_fault(struct pt_regs *regs,
> >> > }
> >> > retry:
> >> > mmap_read_lock(mm);
> >> > - } else {
> >> > - /*
> >> > - * The above down_read_trylock() might have succeeded in
> >> > - * which case we'll have missed the might_sleep() from
> >> > - * down_read():
> >> > - */
> >> > - might_sleep();
> >> > }
> >> >
> >> > vma = find_vma(mm, address);
> >>
> >> The comment is stale, which isn't great. The might_sleep() is already
> >> in the fast path. So, moving it up above makes a lot of sense just in
> >> terms of simplicity.
>
> I don't think so. The point is:
>
> if (unlikely(!mmap_read_trylock(mm))) {
> if (!user_mode(regs) && !search_exception_tables(regs->ip)) {
> /*
> * Fault from code in kernel from
> * which we do not expect faults.
> */
> bad_area_nosemaphore(regs, error_code, address);
> return;
> }
>
> Moving it up will make the might_sleep() splat more important than an
> unexpected fault when the unexpected fault happens in e.g. a preemption
> disabled region. That's wrong because the important information in this
> case is not the might sleep splat. The important information is the
> fault itself.
>
> But moving it up is even more wrong for spurious faults which are
> correctly handled in that case via:
>
> bad_area_nosemaphore()
> __bad_area_nosemaphore()
> kernelmode_fixup_or_oops()
> handle(AMD erratum #91)
> is_prefetch()
>
> So if such a spurious fault happens in a condition which would trigger
> the might_sleep() splat then moving might_sleep() before the trylock()
> will cause false positives. So, no. It's going to stay where it is.

Got it. I didn't realize those cases. Thank you for your patient
explanation.


>
> > Without this patch, I didn't see the might_sleep() in the fast path. What
> > am I missing here?
>
> I have no idea what you are doing. If the trylock() succeeds and the
> fault happened in e.g. a preemption disabled region then the
> might_sleep() in the else path will trigger no matter what.
>
> Thanks,
>
> tglx