Re: [PATCH v3 01/12] powerpc: mm/fault: Fix kfence page fault reporting
From: Ritesh Harjani (IBM)
Date: Mon Oct 21 2024 - 23:40:00 EST
Michael Ellerman <mpe@xxxxxxxxxxxxxx> writes:
> Hi Ritesh,
>
> "Ritesh Harjani (IBM)" <ritesh.list@xxxxxxxxx> writes:
>> copy_from_kernel_nofault() can be called when doing read of /proc/kcore.
>> /proc/kcore can have some unmapped kfence objects which when read via
>> copy_from_kernel_nofault() can cause page faults. Since *_nofault()
>> functions define their own fixup table for handling fault, use that
>> instead of asking kfence to handle such faults.
>>
>> Hence we search the exception tables for the nip which generated the
>> fault. If there is an entry then we let the fixup table handler handle the
>> page fault by returning an error from within ___do_page_fault().
>>
>> This can be easily triggered if someone tries to do dd from /proc/kcore.
>> dd if=/proc/kcore of=/dev/null bs=1M
>>
>> <some example false negatives>
>> ===============================
>> BUG: KFENCE: invalid read in copy_from_kernel_nofault+0xb0/0x1c8
>> Invalid read at 0x000000004f749d2e:
>> copy_from_kernel_nofault+0xb0/0x1c8
>> 0xc0000000057f7950
>> read_kcore_iter+0x41c/0x9ac
>> proc_reg_read_iter+0xe4/0x16c
>> vfs_read+0x2e4/0x3b0
>> ksys_read+0x88/0x154
>> system_call_exception+0x124/0x340
>> system_call_common+0x160/0x2c4
>
> I haven't been able to reproduce this. Can you give some more details on
> the exact machine/kernel-config/setup where you saw this?
Without this patch I can hit this on book3s64 with both Radix and
Hash. I believe the configs below should do the job. It should be
reproducible on qemu, on an LPAR, or on bare metal.
root-> cat .out-ppc/.config |grep -i KFENCE
CONFIG_HAVE_ARCH_KFENCE=y
CONFIG_KFENCE=y
CONFIG_KFENCE_SAMPLE_INTERVAL=100
CONFIG_KFENCE_NUM_OBJECTS=255
# CONFIG_KFENCE_DEFERRABLE is not set
# CONFIG_KFENCE_STATIC_KEYS is not set
CONFIG_KFENCE_STRESS_TEST_FAULTS=0
CONFIG_KFENCE_KUNIT_TEST=y
root-> cat .out-ppc/.config |grep -i KCORE
CONFIG_PROC_KCORE=y
root-> cat .out-ppc/.config |grep -i KUNIT
CONFIG_KFENCE_KUNIT_TEST=y
CONFIG_KUNIT=y
CONFIG_KUNIT_DEFAULT_ENABLED=y
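The greps above can be folded into a single check. This is only a
minimal sketch: the function name check_kfence_config is made up for
illustration, and it assumes the options are built in (=y) as in the
listing above, so a modular (=m) kunit test would be reported as
missing.

```shell
# Hypothetical helper (not part of the kernel tree): verify that a kernel
# .config enables the options from the listing above. Prints one line per
# missing option and returns non-zero if any of them is absent.
check_kfence_config() {
    config="$1"
    missing=0
    for opt in CONFIG_KFENCE CONFIG_PROC_KCORE CONFIG_KUNIT CONFIG_KFENCE_KUNIT_TEST; do
        # Anchor the pattern so CONFIG_KFENCE does not match CONFIG_KFENCE_*.
        if ! grep -q "^${opt}=y\$" "$config"; then
            echo "missing: ${opt}=y"
            missing=1
        fi
    done
    return "$missing"
}
```

With the build directory used above it would be called as
check_kfence_config .out-ppc/.config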
Then running dd as shown below can hit the issue. Maybe let it run for
a few minutes and see?
~ # dd if=/proc/kcore of=/dev/null bs=1M
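Since the report may take a few minutes to show up, the run can be
bounded and the kernel log checked afterwards. A rough sketch, assuming
a timeout utility (coreutils or a recent busybox) and root access for
dmesg and /proc/kcore; both function names are made up for
illustration.

```shell
# Hypothetical helper: count KFENCE reports in a kernel log read from stdin.
# grep -c exits non-zero when the count is 0, so mask that with "|| true".
count_kfence_reports() {
    grep -c 'BUG: KFENCE:' || true
}

# Hypothetical helper: read /proc/kcore for at most $1 seconds (default 300),
# then print how many KFENCE reports the kernel log contains. Needs root.
run_kcore_repro() {
    secs="${1:-300}"
    timeout "$secs" dd if=/proc/kcore of=/dev/null bs=1M 2>/dev/null
    dmesg | count_kfence_reports
}
```

Calling run_kcore_repro 300 and getting a non-zero count would mean a
report was logged. The count covers the whole log, so reports from
before the dd run are counted too.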
Alternatively, running this kfence kunit test [1] also reproduces the
same bug. The configs above include the kfence kunit option as well, so
the test runs at boot time itself.
[1]: https://lore.kernel.org/linuxppc-dev/210e561f7845697a32de44b643393890f180069f.1729272697.git.ritesh.list@xxxxxxxxx/
Note: This was originally reported internally; the tester was running
perf test 'Object code reading' [2].
[2]: https://github.com/torvalds/linux/blob/master/tools/perf/tests/code-reading.c#L737
Thanks for looking into this. Let me know if this helped.
-ritesh