Re: [RFC PATCH] x86/fault: move might_sleep() out of mmap read lock

From: Thomas Gleixner
Date: Thu Nov 25 2021 - 05:49:46 EST


Muchun, Dave!

On Mon, Nov 22 2021 at 14:59, Muchun Song wrote:
> On Fri, Nov 19, 2021 at 11:04 PM Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
>> >
>> > + might_sleep();
>> > +
>> > /*
>> > * Kernel-mode access to the user address space should only occur
>> > * on well-defined single instructions listed in the exception
>> > @@ -1346,13 +1348,6 @@ void do_user_addr_fault(struct pt_regs *regs,
>> > }
>> > retry:
>> > mmap_read_lock(mm);
>> > - } else {
>> > - /*
>> > - * The above down_read_trylock() might have succeeded in
>> > - * which case we'll have missed the might_sleep() from
>> > - * down_read():
>> > - */
>> > - might_sleep();
>> > }
>> >
>> > vma = find_vma(mm, address);
>>
>> The comment is stale, which isn't great. The might_sleep() is already
>> in the fast path. So, moving it up above makes a lot of sense just in
>> terms of simplicity.

I don't think so. The point is:

if (unlikely(!mmap_read_trylock(mm))) {
if (!user_mode(regs) && !search_exception_tables(regs->ip)) {
/*
* Fault from code in kernel from
* which we do not expect faults.
*/
bad_area_nosemaphore(regs, error_code, address);
return;
}

Moving it up will make the might_sleep() splat more important than an
unexpected fault when the unexpected fault happens in e.g. a preemption
disabled region. That's wrong because the important information in this
case is not the might sleep splat. The important information is the
fault itself.

But moving it up is even more wrong for spurious faults which are
correctly handled in that case via:

bad_area_nosemaphore()
__bad_area_nosemaphore()
kernelmode_fixup_or_oops()
handle(AMD erratum #91)
is_prefetch()

So if such a spurious fault happens in a condition which would trigger
the might_sleep() splat then moving might_sleep() before the trylock()
will cause false positives. So, no. It's going to stay where it is.

> Without this patch, I didn't see the might_sleep() in the fast path. What
> am I missing here?

I have no idea what you are doing. If the trylock() succeeds and the
fault happened in e.g. a preemption disabled region then the
might_sleep() in the else path will trigger no matter what.

Thanks,

tglx