Re: [PATCH v3 10/14] x86: Update the KASAN non-canonical hook

From: Maciej Wieczor-Retman
Date: Wed Apr 09 2025 - 10:38:13 EST


On 2025-04-04 at 10:37:53 -0700, Dave Hansen wrote:
>On 4/4/25 06:14, Maciej Wieczor-Retman wrote:
>> The kasan_non_canonical_hook() is useful in pointing out that an address
>> which caused some kind of error could be the result of
>> kasan_mem_to_shadow() mapping. Currently it's called only in the general
>> protection handler code path but can give helpful information also in
>> page fault oops reports.
>>
>> For example consider a page fault for address 0xffdefc0000000000 on a
>> 5-level paging system. It could have been accessed from KASAN's
>> kasan_mem_to_shadow() called on 0xfef0000000000000 address. Without the
>> kasan_non_canonical_hook() in the page fault case it might be hard to
>> figure out why an error occurred.
>>
>> Add kasan_non_canonical_hook() to the beginning of show_fault_oops().
>>
>> Update kasan_non_canonical_hook() to take into account the possible
>> memory to shadow mappings in the software tag-based mode of x86.
>>
>> Patch was tested with positive results by accessing the following
>> addresses, causing #GPs and #PFs.
>>
>> Valid mappings (showing kasan_non_canonical_hook() message):
>> 0xFFFFFFFF8FFFFFFF
>> 0xFEF0000000000000
>> 0x7FFFFF4FFFFFFFFF
>> 0x7EF0000000000000
>> Invalid mappings (not showing kasan_non_canonical_hook() message):
>> 0xFFFFFFFFF8FFFFFF
>> 0xFFBFFC0000000000
>> 0x07EFFC0000000000
>> 0x000E000000000000
>>
>> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@xxxxxxxxx>
>> ---
>> Changelog v3:
>> - Move the report.c part from first patch in the series to this new
>> patch to have x86 changes in one place.
>> - Add the call in fault oops.
>> - Extend the comment in report.c with a graphical representation of what
>> addresses are valid and invalid in memory to shadow mapping.
>>
>> arch/x86/mm/fault.c | 2 ++
>> mm/kasan/report.c | 36 +++++++++++++++++++++++++++++++++++-
>> 2 files changed, 37 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
>> index 697432f63c59..16366af60ae5 100644
>> --- a/arch/x86/mm/fault.c
>> +++ b/arch/x86/mm/fault.c
>> @@ -511,6 +511,8 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long ad
>> if (!oops_may_print())
>> return;
>>
>> + kasan_non_canonical_hook(address);
>> +
>> if (error_code & X86_PF_INSTR) {
>> unsigned int level;
>> bool nx, rw;
>> diff --git a/mm/kasan/report.c b/mm/kasan/report.c
>> index f24f11cc644a..135307c93c2c 100644
>> --- a/mm/kasan/report.c
>> +++ b/mm/kasan/report.c
>> @@ -700,7 +700,7 @@ void kasan_non_canonical_hook(unsigned long addr)
>> * operation would overflow only for some memory addresses. However, due
>> * to the chosen KASAN_SHADOW_OFFSET values and the fact the
>> * kasan_mem_to_shadow() only operates on pointers with the tag reset,
>> - * the overflow always happens.
>> + * the overflow always happens (for both x86 and arm64).
>> *
>> * For arm64, the top byte of the pointer gets reset to 0xFF. Thus, the
>> * possible shadow addresses belong to a region that is the result of
>> @@ -715,6 +715,40 @@ void kasan_non_canonical_hook(unsigned long addr)
>> return;
>> }
>>
>> + /*
>> + * For x86-64, only the pointer bits [62:57] get reset, and bits #63
>> + * and #56 can be 0 or 1. Thus, kasan_mem_to_shadow() can be possibly
>> + * applied to two regions of memory:
>> + * [0x7E00000000000000, 0x7FFFFFFFFFFFFFFF] and
>> + * [0xFE00000000000000, 0xFFFFFFFFFFFFFFFF]. As the overflow happens
>> + * for both ends of both memory ranges, both possible shadow regions
>> + * are contiguous.
>> + *
>> + * Given the KASAN_SHADOW_OFFSET equal to 0xffeffc0000000000, the
>> + * following ranges are valid mem-to-shadow mappings:
>> + *
>> + * 0xFFFFFFFFFFFFFFFF
>> + * INVALID
>> + * 0xFFEFFBFFFFFFFFFF - kasan_mem_to_shadow(~0UL)
>> + * VALID - kasan shadow mem
>> + * VALID - non-canonical kernel virtual address
>> + * 0xFFCFFC0000000000 - kasan_mem_to_shadow(0xFEUL << 56)
>> + * INVALID
>> + * 0x07EFFBFFFFFFFFFF - kasan_mem_to_shadow(~0UL >> 1)
>> + * VALID - non-canonical user virtual addresses
>> + * VALID - user addresses
>> + * 0x07CFFC0000000000 - kasan_mem_to_shadow(0x7EUL << 56)
>> + * INVALID
>> + * 0x0000000000000000
>> + */
>> + if (IS_ENABLED(CONFIG_KASAN_SW_TAGS) && IS_ENABLED(CONFIG_X86_64)) {
>
>One overall comment on this series: there's a lot of unnecessary
>complexity. Case in point:
>
> config ADDRESS_MASKING
> depends on X86_64
>
>and
>
> select HAVE_ARCH_KASAN_SW_TAGS if ADDRESS_MASKING
>
>and
>
> config KASAN_SW_TAGS
> depends on HAVE_ARCH_KASAN_SW_TAGS ...
>
>
>So you can't have CONFIG_KASAN_SW_TAGS set without CONFIG_X86_64.

This line was there so it only runs on x86 and not on arm64. So yeah, it looks
redundant but if we remove the first condition we might get some bogus info in
generic mode, and if we remove the second condition people on arm might get some
weird output.

If you still think this needs to get separated somehow then perhaps moving all
the CONFIG_KASAN_SW_TAGS code to a separate function might do it? And then there
would only be something like if(x86) else if(arm).

>
>> + if ((addr < (u64)kasan_mem_to_shadow((void *)(0x7E UL << 56)) ||
>> + addr > (u64)kasan_mem_to_shadow((void *)(~0UL >> 1))) &&
>> + (addr < (u64)kasan_mem_to_shadow((void *)(0xFE UL << 56)) ||
>> + addr > (u64)kasan_mem_to_shadow((void *)(~0UL))))
>> + return;
>> + }
>This isn't looking great.
>
>I'd much rather have those kasan_mem_to_shadow() arguments be built up
>programmatically.
>
>I'm also not following the description of where these ranges come from:
>
> [0x7E00000000000000, 0x7FFFFFFFFFFFFFFF]
> [0xFE00000000000000, 0xFFFFFFFFFFFFFFFF]
>
>I obviously recognize the top kernel and top userspace addresses, but
>there do 0x7E... and 0xFE... come from? Is that because both of them
>only have 56 actual bits of address space?

We want to cover the whole LA mapping to shadow memory. These ranges are due to
the compiler masking out tag bits (in this case the 6 LAM bits). So after
masking out the [57:62] bits, we get these two contiguous ranges. All the other
addresses are invalid mappings when using kasan_mem_to_shadow().

We had a long exchange with Andrey on how to cover all the edge cases. Perhaps
this message summarizes it best [1].

>
>Wouldn't we be better off writing that as, say:
>
>#define HIGHEST_KER_ADDR (void *)0xFFFFFFFFFFFFFFFF
>// ^ we probably have some macro for that already
>#define LOWEST_KERN_ADDR (void *)(HIGHEST_KERNEL_ADDRESS - \
> (1<<56) + 1)
>// ^ or can this be calculated by tag manipulation?
>
>which yields:
>
> void *_addr = (u64)addr;
> ...
>
> in_kern_shadow = (_addr >= kasan_mem_to_shadow(LOWEST_KERN_ADDR) ||
> (_addr <= kasan_mem_to_shadow(HIGHEST_KERN_ADDR);
> in_user_shadow = (_addr >= kasan_mem_to_shadow(LOWEST_USER_ADDR) ||
> (_addr <= kasan_mem_to_shadow(HIGHEST_USER_ADDR);
>
> if (!in_kern_shadow &&
> !in_user_shadow)
> return;
>
>I _think_ that's the same logic you have. Isn't it slightly more readable?

Yes, I like it more than just generating the addresses in the parenthesis. What
do you think about this naming? KASAN prefix and [k/u]addr since it's not really
the lowest/highest address in the whole LA, just in this KASAN compiler scheme.
And I changed 1<<56 to 2<<56 so it generates 0xFE00000000000000 instead of
0xFF00000000000000.

#define KASAN_HIGHEST_KADDR (void *)0xFFFFFFFFFFFFFFFF
#define KASAN_LOWEST_KADDR (void *)(KASAN_HIGHEST_KADDR - \
(2<<56) + 1)
#define KASAN_HIGHEST_UADDR (void *)0x7FFFFFFFFFFFFFFF
#define KASAN_LOWEST_UADDR (void *)(KASAN_HIGHEST_UADDR - \
(2<<56) + 1)

[1] https://lore.kernel.org/all/CA+fCnZdUFO0+G9HHy4oaQfEx8sm3D_ZfxdkH3y2ZojjYqTN74Q@xxxxxxxxxxxxxx/

--
Kind regards
Maciej Wieczór-Retman