Re: [PATCH] fork, vmalloc: KASAN-poison backing pages of vmapped stacks

From: Dmitry Vyukov
Date: Wed Jan 18 2023 - 03:03:52 EST


On Tue, 17 Jan 2023 at 17:35, Jann Horn <jannh@xxxxxxxxxx> wrote:
>
> KASAN (except in HW_TAGS mode) tracks memory state based on virtual
> addresses. The mappings of kernel stack pages in the linear mapping are
> currently marked as fully accessible.

Hi Jann,

To confirm my understanding: KASAN in a non-HW_TAGS mode is not enough
on its own, and this also requires CONFIG_VMAP_STACK, right?
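
For readers of the archive, the aliasing issue described in the commit
message can be sketched with a toy userspace model. This is not kernel
code: every name below is hypothetical, the "pages" are 16 bytes, and it
uses one shadow byte per byte, whereas generic KASAN uses one per 8 bytes.
It only shows that shadow keyed by virtual address tracks the two aliases
of a page independently.

```c
/* Toy userspace model of address-keyed shadow memory. Not kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE    16u	/* tiny "pages" keep the model small */
#define TOY_NR_PAGES     4u
#define TOY_LINEAR_BASE  0u	/* "linear map" alias of each page */
#define TOY_VMALLOC_BASE (TOY_NR_PAGES * TOY_PAGE_SIZE)	/* "vmalloc" alias */
#define TOY_VA_SIZE      (2u * TOY_NR_PAGES * TOY_PAGE_SIZE)

/* One shadow byte per virtual byte; nonzero means "access forbidden". */
static unsigned char toy_shadow[TOY_VA_SIZE];

static unsigned toy_linear_va(unsigned page)
{
	return TOY_LINEAR_BASE + page * TOY_PAGE_SIZE;
}

static unsigned toy_vmalloc_va(unsigned page)
{
	return TOY_VMALLOC_BASE + page * TOY_PAGE_SIZE;
}

/* The check an instrumented access would perform. */
static bool toy_access_ok(unsigned va)
{
	assert(va < TOY_VA_SIZE);
	return toy_shadow[va] == 0;
}

/* Models the proposed step: poison only the linear-map alias of a page. */
static void toy_poison_backing_page(unsigned page)
{
	assert(page < TOY_NR_PAGES);
	memset(&toy_shadow[toy_linear_va(page)], 1, TOY_PAGE_SIZE);
}
```

Before toy_poison_backing_page() is called, both aliases of a page pass
the check. Afterwards only the vmalloc alias does, which is the behaviour
the patch is after for stack pages.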

> Since stack corruption issues can cause some very gnarly errors, let's be
> extra careful and tell KASAN to forbid accesses to stack memory through the
> linear mapping.
>
> Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
> ---
> I wrote this after seeing
> https://lore.kernel.org/all/Y8W5rjKdZ9erIF14@xxxxxxxxxxxxxxxxxxxx/
> and wondering about possible ways that this kind of stack corruption
> could be sneaking past KASAN.
> That's proooobably not the explanation, but still...

I think catching any silent corruption is still very useful. Besides
producing confusing reports, such corruptions sometimes lead to an
explosion of random reports all over the kernel.

> include/linux/vmalloc.h | 6 ++++++
> kernel/fork.c | 10 ++++++++++
> mm/vmalloc.c | 24 ++++++++++++++++++++++++
> 3 files changed, 40 insertions(+)
>
> diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> index 096d48aa3437..bfb50178e5e3 100644
> --- a/include/linux/vmalloc.h
> +++ b/include/linux/vmalloc.h
> @@ -297,4 +297,10 @@ bool vmalloc_dump_obj(void *object);
> static inline bool vmalloc_dump_obj(void *object) { return false; }
> #endif
>
> +#if defined(CONFIG_MMU) && (defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS))
> +void vmalloc_poison_backing_pages(const void *addr);
> +#else
> +static inline void vmalloc_poison_backing_pages(const void *addr) {}
> +#endif

I think this should be in the kasan headers and prefixed with kasan_.
kmsan/kcsan may also poison memory, and there is hw poisoning
(MADV_HWPOISON), so "poison" is a somewhat overloaded term on its own.
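
A minimal sketch of what that could look like, reusing the guard from the
patch. The name kasan_poison_vmalloc_backing_pages() is only a
placeholder chosen for illustration, and the exact header it would live
in is left open:

```c
/*
 * Sketch only: the function name is hypothetical. The #if condition is
 * copied from the patch; without those config symbols defined, the
 * no-op stub is selected.
 */
#if defined(CONFIG_MMU) && \
	(defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS))
void kasan_poison_vmalloc_backing_pages(const void *addr);
#else
static inline void kasan_poison_vmalloc_backing_pages(const void *addr)
{
	(void)addr;	/* nothing to do when there is no address-based shadow */
}
#endif
```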

Can/should this be extended to all vmalloc-ed memory? Or can some of
it legitimately be accessed via both addresses?

Also, should we instead mprotect the pages in the linear mapping while
they are allocated as the stack? If that works, it looks like a
reasonable improvement for CONFIG_VMAP_STACK in general. It would also
catch non-instrumented accesses.
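
As a userspace analogy only (not a proposal for how the kernel would do
it), the idea can be sketched by mapping the same page twice and revoking
access through one alias. The struct and function names are hypothetical;
"primary" stands in for the vmalloc mapping and "alias" for the linear
mapping.

```c
/* Userspace analogy of protecting one alias of a doubly-mapped page. */
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

struct toy_alias {
	unsigned char *primary;	/* plays the role of the vmalloc mapping */
	unsigned char *alias;	/* plays the role of the linear mapping */
	size_t len;
};

/* Map one page of an unnamed temporary file twice. Returns 0 on success. */
static int toy_alias_create(struct toy_alias *a)
{
	long page = sysconf(_SC_PAGESIZE);
	FILE *f;
	int fd;

	if (page <= 0)
		return -1;
	f = tmpfile();
	if (!f)
		return -1;
	fd = fileno(f);
	if (ftruncate(fd, page) != 0) {
		fclose(f);
		return -1;
	}
	a->len = (size_t)page;
	a->primary = mmap(NULL, a->len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	a->alias = mmap(NULL, a->len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	fclose(f);	/* the mappings stay valid after the file is closed */
	if (a->primary == MAP_FAILED || a->alias == MAP_FAILED)
		return -1;
	return 0;
}

/* Revoke all access through the alias; the primary mapping is unaffected. */
static int toy_alias_protect(struct toy_alias *a)
{
	assert(a && a->alias != MAP_FAILED);
	return mprotect(a->alias, a->len, PROT_NONE);
}
```

After toy_alias_protect(), any access through the alias faults whether or
not the accessing code is instrumented, while reads and writes through
the primary mapping keep working.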

> #endif /* _LINUX_VMALLOC_H */
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 9f7fe3541897..5c8c103a3597 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -321,6 +321,16 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
>  		vfree(stack);
>  		return -ENOMEM;
>  	}
> +
> +	/*
> +	 * A virtually-allocated stack's memory should only be accessed through
> +	 * the vmalloc area, not through the linear mapping.
> +	 * Inform KASAN that all accesses through the linear mapping should be
> +	 * reported (instead of permitting all accesses through the linear
> +	 * mapping).
> +	 */
> +	vmalloc_poison_backing_pages(stack);
> +
>  	/*
>  	 * We can't call find_vm_area() in interrupt context, and
>  	 * free_thread_stack() can be called in interrupt context,
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index ca71de7c9d77..10c79c53cf5c 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -4042,6 +4042,30 @@ void pcpu_free_vm_areas(struct vm_struct **vms, int nr_vms)
> }
> #endif /* CONFIG_SMP */
>
> +#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
> +/*
> + * Poison the KASAN shadow for the linear mapping of the pages used as stack
> + * memory.
> + * NOTE: This makes no sense in HW_TAGS mode because HW_TAGS marks physical
> + * memory, not virtual memory.
> + */
> +void vmalloc_poison_backing_pages(const void *addr)
> +{
> +	struct vm_struct *area;
> +	int i;
> +
> +	if (WARN(!PAGE_ALIGNED(addr), "bad address (%p)\n", addr))
> +		return;
> +
> +	area = find_vm_area(addr);
> +	if (WARN(!area, "nonexistent vm area (%p)\n", addr))
> +		return;
> +
> +	for (i = 0; i < area->nr_pages; i++)
> +		kasan_poison_pages(area->pages[i], 0, false);
> +}
> +#endif
> +
> #ifdef CONFIG_PRINTK
> bool vmalloc_dump_obj(void *object)
> {
>
> base-commit: 5dc4c995db9eb45f6373a956eb1f69460e69e6d4
> --
> 2.39.0.314.g84b9a713c41-goog
>