Re: [RFC RESEND v2 02/13] powerpc: mm: Fix kfence page fault reporting
From: Ritesh Harjani (IBM)
Date: Tue Oct 15 2024 - 04:22:48 EST
Christophe Leroy <christophe.leroy@xxxxxxxxxx> writes:
> Le 15/10/2024 à 03:33, Ritesh Harjani (IBM) a écrit :
>> copy_from_kernel_nofault() can be called when doing read of /proc/kcore.
>> /proc/kcore can have some unmapped kfence objects which when read via
>> copy_from_kernel_nofault() can cause page faults. Since *_nofault()
>> functions define their own fixup tables for handling faults, use those
>> instead of asking kfence to handle such faults.
>>
>> Hence we search the exception tables for the nip which generated the
>> fault. If there is an entry, we let the fixup table handler handle the
>> page fault by returning an error from within ___do_page_fault().
>>
>> This can be easily triggered if someone tries to do dd from /proc/kcore.
>> dd if=/proc/kcore of=/dev/null bs=1M
>>
>> <some example false negatives>
>> ===============================
>> BUG: KFENCE: invalid read in copy_from_kernel_nofault+0xb0/0x1c8
>> Invalid read at 0x000000004f749d2e:
>> copy_from_kernel_nofault+0xb0/0x1c8
>> 0xc0000000057f7950
>> read_kcore_iter+0x41c/0x9ac
>> proc_reg_read_iter+0xe4/0x16c
>> vfs_read+0x2e4/0x3b0
>> ksys_read+0x88/0x154
>> system_call_exception+0x124/0x340
>> system_call_common+0x160/0x2c4
>>
>> BUG: KFENCE: use-after-free read in copy_from_kernel_nofault+0xb0/0x1c8
>> Use-after-free read at 0x000000008fbb08ad (in kfence-#0):
>> copy_from_kernel_nofault+0xb0/0x1c8
>> 0xc0000000057f7950
>> read_kcore_iter+0x41c/0x9ac
>> proc_reg_read_iter+0xe4/0x16c
>> vfs_read+0x2e4/0x3b0
>> ksys_read+0x88/0x154
>> system_call_exception+0x124/0x340
>> system_call_common+0x160/0x2c4
>>
>> Guessing the fix should go back to when we first got kfence on PPC32.
>>
>> Fixes: 90cbac0e995d ("powerpc: Enable KFENCE for PPC32")
>> Reported-by: Disha Goel <disgoel@xxxxxxxxxxxxx>
>> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@xxxxxxxxx>
>> ---
>> arch/powerpc/mm/fault.c | 10 +++++++++-
>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
>> index 81c77ddce2e3..fa825198f29f 100644
>> --- a/arch/powerpc/mm/fault.c
>> +++ b/arch/powerpc/mm/fault.c
>> @@ -439,9 +439,17 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
>> /*
>> * The kernel should never take an execute fault nor should it
>> * take a page fault to a kernel address or a page fault to a user
>> - * address outside of dedicated places
>> + * address outside of dedicated places.
>> + *
>> + * Rather than kfence reporting false negatives, let the fixup table
>> + * handler handle the page fault by returning SIGSEGV, if the fault
>> + * has come from functions like copy_from_kernel_nofault().
>> */
>> if (unlikely(!is_user && bad_kernel_fault(regs, error_code, address, is_write))) {
>> +
>> + if (search_exception_tables(instruction_pointer(regs)))
>> + return SIGSEGV;
>
> This is a heavy operation. It should at least be done only when KFENCE
> is built-in.
>
> kfence_handle_page_fault() bails out immediately when
> is_kfence_address() returns false, and is_kfence_address() returns
> always false when KFENCE is not built-in.
>
> So you could check that before calling the heavy weight
> search_exception_tables().
>
> if (is_kfence_address(address) &&
> !search_exception_tables(instruction_pointer(regs)) &&
> kfence_handle_page_fault(address, is_write, regs))
> return 0;
>
Yes, thanks for the input. I agree with the above and will take that in
v3. I will wait for some time for any review comments on the other
patches before spinning a v3, though.
>> +
>> if (kfence_handle_page_fault(address, is_write, regs))
>> return 0;
>>
-ritesh