Re: [PATCH v2 3/7] mm: Introduce FAULT_FLAG_INTERRUPTIBLE

From: David Hildenbrand
Date: Fri Sep 06 2019 - 08:53:23 EST


On 06.09.19 14:38, Peter Xu wrote:
> On Fri, Sep 06, 2019 at 11:02:22AM +0200, David Hildenbrand wrote:
>> On 05.09.19 12:15, Peter Xu wrote:
>>> handle_userfaultfd() is currently the only one place in the kernel
>>> page fault procedures that can respond to non-fatal userspace signals.
>>> It was trying to detect such an allowance by checking against USER &
>>> KILLABLE flags, which was "un-official".
>>>
>>> In this patch, we introduced a new flag (FAULT_FLAG_INTERRUPTIBLE) to
>>> show that the fault handler allows the fault procedure to respond even
>>> to non-fatal signals. Meanwhile, add this new flag to the default
>>> fault flags so that all the page fault handlers can benefit from the
>>> new flag. With that, replacing the userfault check to this one.
>>>
>>> Since the line is getting even longer, clean up the fault flags a bit
>>> too to ease TTY users.
>>>
>>> Although we've got a new flag and applied it, we shouldn't have any
>>> functional change with this patch so far.
>>>
>>> Suggested-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>>> Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
>>> ---
>>> fs/userfaultfd.c | 4 +---
>>> include/linux/mm.h | 39 ++++++++++++++++++++++++++++-----------
>>> 2 files changed, 29 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
>>> index ccbdbd62f0d8..4a8ad2dc2b6f 100644
>>> --- a/fs/userfaultfd.c
>>> +++ b/fs/userfaultfd.c
>>> @@ -462,9 +462,7 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
>>> uwq.ctx = ctx;
>>> uwq.waken = false;
>>>
>>> - return_to_userland =
>>> - (vmf->flags & (FAULT_FLAG_USER|FAULT_FLAG_KILLABLE)) ==
>>> - (FAULT_FLAG_USER|FAULT_FLAG_KILLABLE);
>>> + return_to_userland = vmf->flags & FAULT_FLAG_INTERRUPTIBLE;
>>> blocking_state = return_to_userland ? TASK_INTERRUPTIBLE :
>>> TASK_KILLABLE;
>>>
>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>> index 57fb5c535f8e..53ec7abb8472 100644
>>> --- a/include/linux/mm.h
>>> +++ b/include/linux/mm.h
>>> @@ -383,22 +383,38 @@ extern unsigned int kobjsize(const void *objp);
>>> */
>>> extern pgprot_t protection_map[16];
>>>
>>> -#define FAULT_FLAG_WRITE 0x01 /* Fault was a write access */
>>> -#define FAULT_FLAG_MKWRITE 0x02 /* Fault was mkwrite of existing pte */
>>> -#define FAULT_FLAG_ALLOW_RETRY 0x04 /* Retry fault if blocking */
>>> -#define FAULT_FLAG_RETRY_NOWAIT 0x08 /* Don't drop mmap_sem and wait when retrying */
>>> -#define FAULT_FLAG_KILLABLE 0x10 /* The fault task is in SIGKILL killable region */
>>> -#define FAULT_FLAG_TRIED 0x20 /* Second try */
>>> -#define FAULT_FLAG_USER 0x40 /* The fault originated in userspace */
>>> -#define FAULT_FLAG_REMOTE 0x80 /* faulting for non current tsk/mm */
>>> -#define FAULT_FLAG_INSTRUCTION 0x100 /* The fault was during an instruction fetch */
>>> +/**
>>> + * Fault flag definitions.
>>> + *
>>> + * @FAULT_FLAG_WRITE: Fault was a write fault.
>>> + * @FAULT_FLAG_MKWRITE: Fault was mkwrite of existing PTE.
>>> + * @FAULT_FLAG_ALLOW_RETRY: Allow to retry the fault if blocked.
>>> + * @FAULT_FLAG_RETRY_NOWAIT: Don't drop mmap_sem and wait when retrying.
>>> + * @FAULT_FLAG_KILLABLE: The fault task is in SIGKILL killable region.
>>> + * @FAULT_FLAG_TRIED: The fault has been tried once.
>>> + * @FAULT_FLAG_USER: The fault originated in userspace.
>>> + * @FAULT_FLAG_REMOTE: The fault is not for current task/mm.
>>> + * @FAULT_FLAG_INSTRUCTION: The fault was during an instruction fetch.
>>> + * @FAULT_FLAG_INTERRUPTIBLE: The fault can be interrupted by non-fatal signals.
>>> + */
>>> +#define FAULT_FLAG_WRITE 0x01
>>> +#define FAULT_FLAG_MKWRITE 0x02
>>> +#define FAULT_FLAG_ALLOW_RETRY 0x04
>>> +#define FAULT_FLAG_RETRY_NOWAIT 0x08
>>> +#define FAULT_FLAG_KILLABLE 0x10
>>> +#define FAULT_FLAG_TRIED 0x20
>>> +#define FAULT_FLAG_USER 0x40
>>> +#define FAULT_FLAG_REMOTE 0x80
>>> +#define FAULT_FLAG_INSTRUCTION 0x100
>>> +#define FAULT_FLAG_INTERRUPTIBLE 0x200
>>>
>>
>> I'd probably split off the unrelated doc changes. Just a matter of taste.
>
> The thing is that it's not really a document change but only a format
> change (when I wanted to add the new macro it's easily getting out of
> 80 chars so I simply reformatted all the rest to make them look
> similar). I'm afraid that could be too trivial to change the format
> as a single patch, but I can do it if anyone else also thinks it
> proper.
>
>>
>>> /*
>>> * The default fault flags that should be used by most of the
>>> * arch-specific page fault handlers.
>>> */
>>> #define FAULT_FLAG_DEFAULT (FAULT_FLAG_ALLOW_RETRY | \
>>> - FAULT_FLAG_KILLABLE)
>>> + FAULT_FLAG_KILLABLE | \
>>> + FAULT_FLAG_INTERRUPTIBLE)
>>
>> So by default, all faults are marked interruptible, also
>> !FAULT_FLAG_USER. I assume the trick right now is that
>> handle_userfault() will indeed only be called on user faults and the
>> flag is used nowhere else ;)
>
> Sorry if this is confusing, but FAULT_FLAG_DEFAULT is just a macro to
> make the patchset easier so we define this initial flags for most of
> the archs (say, there can be some arch that does not use this default
> value, but the fact is most archs are indeed using the same flags
> hence we define it here now).
>
> And, userfaultfd can also handle kernel faults. For FAULT_FLAG_USER,
> it will be set if the fault comes from userspace (in
> do_user_addr_fault()).
>
Got it, sounds sane to me then.

>>
>> Would it make sense to name it FAULT_FLAG_USER_INTERRUPTIBLE, to stress
>> that the flag only applies to user faults? (or am I missing something
>> and this could also apply to !user faults somewhen in the future?
>
> As mentioned above, uffd can handle kernel faults. And, for what I
> understand, it's not really directly related to user fault or not at
> all, instead its more or less match with TASK_{INTERRUPTIBLE|KILLABLE}
> on what kind of signals we care about during the fault processing. So
> it seems to me that it's two different things.
>
>>
>> (I am no expert on the fault paths yet, so sorry for the silly questions)
>
> (I only hope that I'm not providing silly answers. :)

I highly doubt it :) Thanks!

--

Thanks,

David / dhildenb