Re: [RFC][PATCH 5/8] x86/mm: fix exception table comments

From: Sean Christopherson
Date: Fri Sep 07 2018 - 17:04:54 EST


On Fri, 2018-09-07 at 12:49 -0700, Dave Hansen wrote:
> From: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
>
> The comments here are wrong.ÂÂThey are too absolute about where
> faults can occur when running in the kernel.ÂÂThe comments are
> also a bit hard to match up with the code.
>
> Trim down the comments, and make them more precise.
>
> Also add a comment explaining why we are doing the
> bad_area_nosemaphore() path here.
>
> Signed-off-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> Cc: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> Cc: "Peter Zijlstra (Intel)" <peterz@xxxxxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: x86@xxxxxxxxxx
> Cc: Andy Lutomirski <luto@xxxxxxxxxx>
> ---
>
> Âb/arch/x86/mm/fault.c |ÂÂÂ27 ++++++++++++++-------------
> Â1 file changed, 14 insertions(+), 13 deletions(-)
>
> diff -puN arch/x86/mm/fault.c~pkeys-fault-warnings-03 arch/x86/mm/fault.c
> --- a/arch/x86/mm/fault.c~pkeys-fault-warnings-03 2018-09-07 11:21:47.696751898 -0700
> +++ b/arch/x86/mm/fault.c 2018-09-07 11:21:47.700751898 -0700
> @@ -1349,24 +1349,25 @@ void do_user_addr_space_fault(struct pt_
> Â flags |= FAULT_FLAG_INSTRUCTION;
> Â
> Â /*
> - Â* When running in the kernel we expect faults to occur only to
> - Â* addresses in user space.ÂÂAll other faults represent errors in
> - Â* the kernel and should generate an OOPS.ÂÂUnfortunately, in the
> - Â* case of an erroneous fault occurring in a code path which already
> - Â* holds mmap_sem we will deadlock attempting to validate the fault
> - Â* against the address space.ÂÂLuckily the kernel only validly
> - Â* references user space from well defined areas of code, which are
> - Â* listed in the exceptions table.
> + Â* Kernel-mode access to the user address space should only occur
> + Â* inside well-defined areas of code listed in the exception
> + Â* tables.ÂÂBut, an erroneous kernel fault occurring outside one of
> + Â* those areas which also holds mmap_sem might deadlock attempting
> + Â* to validate the fault against the address space.
> Â Â*
> - Â* As the vast majority of faults will be valid we will only perform
> - Â* the source reference check when there is a possibility of a
> - Â* deadlock. Attempt to lock the address space, if we cannot we then
> - Â* validate the source. If this is invalid we can skip the address
> - Â* space check, thus avoiding the deadlock:
> + Â* Only do the expensive exception table search when we might be at
> + Â* risk of a deadlock:
> + Â* 1. We failed to acquire mmap_sem, and
> + Â* 2. The access was an explicit kernel-mode access
> + Â*ÂÂÂÂ(X86_PF_USER=0).

Might be worth reminding the reader that X86_PF_USER will be set in
sw_error_code for implicit accesses. ÂI saw "explicit" and my mind
immediately jumped to hw_error_code for whatever reason. ÂE.g.:

* 2. The access was an explicit kernel-mode access (we set X86_PF_USER
* Â Âin sw_error_code for implicit kernel-mode accesses).

> Â Â*/
> Â if (unlikely(!down_read_trylock(&mm->mmap_sem))) {
> Â if (!(sw_error_code & X86_PF_USER) &&
> Â ÂÂÂÂ!search_exception_tables(regs->ip)) {
> + /*
> + Â* Fault from code in kernel from
> + Â* which we do not expect faults.
> + Â*/
> Â bad_area_nosemaphore(regs, sw_error_code, address, NULL);
> Â return;
> Â }
> _