Re: [PATCH 1/5] timer: kasan: record and print timer stack

From: Marco Elver
Date: Wed Aug 12 2020 - 10:13:31 EST


On Mon, 10 Aug 2020 at 09:23, Walter Wu <walter-zh.wu@xxxxxxxxxxxx> wrote:
> This patch records the last two timer queueing stacks and prints
> up to 2 timer stacks in KASAN report. It is useful for programmers
> to solve use-after-free or double-free memory timer issues.
>
> When timer_setup() or timer_setup_on_stack() is called, then it
> prepares to use this timer and sets timer callback, we store
> this call stack in order to print it in KASAN report.
>
> Signed-off-by: Walter Wu <walter-zh.wu@xxxxxxxxxxxx>
> Cc: Andrey Ryabinin <aryabinin@xxxxxxxxxxxxx>
> Cc: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> Cc: Alexander Potapenko <glider@xxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: John Stultz <john.stultz@xxxxxxxxxx>
> Cc: Stephen Boyd <sboyd@xxxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> ---
> include/linux/kasan.h | 2 ++
> kernel/time/timer.c | 2 ++
> mm/kasan/generic.c | 21 +++++++++++++++++++++
> mm/kasan/kasan.h | 4 +++-
> mm/kasan/report.c | 11 +++++++++++
> 5 files changed, 39 insertions(+), 1 deletion(-)

I'm commenting on the code here, but it also applies to patch 2/5, as
it's almost a copy-paste.

In general, I'd say the solution to get this feature is poorly
designed, resulting in excessive LOC added. The logic added already
exists for the aux stacks.

> diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> index 23b7ee00572d..763664b36dc6 100644
> --- a/include/linux/kasan.h
> +++ b/include/linux/kasan.h
> @@ -175,12 +175,14 @@ static inline size_t kasan_metadata_size(struct kmem_cache *cache) { return 0; }
> void kasan_cache_shrink(struct kmem_cache *cache);
> void kasan_cache_shutdown(struct kmem_cache *cache);
> void kasan_record_aux_stack(void *ptr);
> +void kasan_record_tmr_stack(void *ptr);
>
> #else /* CONFIG_KASAN_GENERIC */
>
> static inline void kasan_cache_shrink(struct kmem_cache *cache) {}
> static inline void kasan_cache_shutdown(struct kmem_cache *cache) {}
> static inline void kasan_record_aux_stack(void *ptr) {}
> +static inline void kasan_record_tmr_stack(void *ptr) {}

It appears that the 'aux' stack is currently only used for call_rcu
stacks, but this interface does not inherently tie it to call_rcu. The
only thing tying it to call_rcu is the fact that the report calls out
call_rcu.

> /**
> diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
> index 4b3cbad7431b..f35dcec990ab 100644
> --- a/mm/kasan/generic.c
> +++ b/mm/kasan/generic.c
> @@ -347,6 +347,27 @@ void kasan_record_aux_stack(void *addr)
> alloc_info->aux_stack[0] = kasan_save_stack(GFP_NOWAIT);
> }
>
> +void kasan_record_tmr_stack(void *addr)
> +{
> + struct page *page = kasan_addr_to_page(addr);
> + struct kmem_cache *cache;
> + struct kasan_alloc_meta *alloc_info;
> + void *object;
> +
> + if (!(page && PageSlab(page)))
> + return;
> +
> + cache = page->slab_cache;
> + object = nearest_obj(cache, page, addr);
> + alloc_info = get_alloc_info(cache, object);
> +
> + /*
> + * record the last two timer stacks.
> + */
> + alloc_info->tmr_stack[1] = alloc_info->tmr_stack[0];
> + alloc_info->tmr_stack[0] = kasan_save_stack(GFP_NOWAIT);
> +}

The solution here is, unfortunately, poorly designed. This is a
copy-paste of the kasan_record_aux_stack() function.

> void kasan_set_free_info(struct kmem_cache *cache,
> void *object, u8 tag)
> {
> diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
> index ef655a1c6e15..c50827f388a3 100644
> --- a/mm/kasan/kasan.h
> +++ b/mm/kasan/kasan.h
> @@ -108,10 +108,12 @@ struct kasan_alloc_meta {
> struct kasan_track alloc_track;
> #ifdef CONFIG_KASAN_GENERIC
> /*
> - * call_rcu() call stack is stored into struct kasan_alloc_meta.
> + * call_rcu() call stack and timer queueing stack are stored
> + * into struct kasan_alloc_meta.
> * The free stack is stored into struct kasan_free_meta.
> */
> depot_stack_handle_t aux_stack[2];
> + depot_stack_handle_t tmr_stack[2];
> #else
> struct kasan_track free_track[KASAN_NR_FREE_STACKS];
> #endif
> diff --git a/mm/kasan/report.c b/mm/kasan/report.c
> index fed3c8fdfd25..6fa3bfee381f 100644
> --- a/mm/kasan/report.c
> +++ b/mm/kasan/report.c
> @@ -191,6 +191,17 @@ static void describe_object(struct kmem_cache *cache, void *object,
> print_stack(alloc_info->aux_stack[1]);
> pr_err("\n");
> }
> +
> + if (alloc_info->tmr_stack[0]) {
> + pr_err("Last timer stack:\n");
> + print_stack(alloc_info->tmr_stack[0]);
> + pr_err("\n");
> + }
> + if (alloc_info->tmr_stack[1]) {
> + pr_err("Second to last timer stack:\n");
> + print_stack(alloc_info->tmr_stack[1]);
> + pr_err("\n");
> + }

Why can't we just use the aux stack for everything, and simply change
the message printed in the report. After all, the stack trace will
include all the information to tell if it's call_rcu, timer, or
workqueue.

The reporting code would simply have to be changed to say something
like "Last potentially related work creation:" -- because what the
"aux" thing is trying to abstract are stack traces to work creations
that took an address that is closely related. Whether or not you want
to call it "work" is up to you, but that's the most generic term I
could think of right now (any better terms?).

Another argument for this consolidation is that it's highly unlikely
that aux_stack[a] && tmr_stack[b] && wq_stack[c], and you need to
print all the stacks. If you are worried we need more aux stacks, just
make the array size 3+ (but I think it's not necessary).

Thanks,
-- Marco