Re: [PATCH 1/5] timer: kasan: record and print timer stack

From: Walter Wu
Date: Wed Aug 12 2020 - 21:46:35 EST


On Wed, 2020-08-12 at 16:13 +0200, Marco Elver wrote:
> On Mon, 10 Aug 2020 at 09:23, Walter Wu <walter-zh.wu@xxxxxxxxxxxx> wrote:
> > This patch records the last two timer queueing stacks and prints
> > up to 2 timer stacks in KASAN report. It is useful for programmers
> > to solve use-after-free or double-free memory timer issues.
> >
> > When timer_setup() or timer_setup_on_stack() is called, then it
> > prepares to use this timer and sets timer callback, we store
> > this call stack in order to print it in KASAN report.
> >
> > Signed-off-by: Walter Wu <walter-zh.wu@xxxxxxxxxxxx>
> > Cc: Andrey Ryabinin <aryabinin@xxxxxxxxxxxxx>
> > Cc: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> > Cc: Alexander Potapenko <glider@xxxxxxxxxx>
> > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > Cc: John Stultz <john.stultz@xxxxxxxxxx>
> > Cc: Stephen Boyd <sboyd@xxxxxxxxxx>
> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > ---
> > include/linux/kasan.h | 2 ++
> > kernel/time/timer.c | 2 ++
> > mm/kasan/generic.c | 21 +++++++++++++++++++++
> > mm/kasan/kasan.h | 4 +++-
> > mm/kasan/report.c | 11 +++++++++++
> > 5 files changed, 39 insertions(+), 1 deletion(-)
>
> I'm commenting on the code here, but it also applies to patch 2/5, as
> it's almost a copy-paste.
>
> In general, I'd say the solution to get this feature is poorly
> designed, resulting in excessive LOC added. The logic added already
> exists for the aux stacks.
>

That's true, we will have refactoring for it.

> > diff --git a/include/linux/kasan.h b/include/linux/kasan.h
> > index 23b7ee00572d..763664b36dc6 100644
> > --- a/include/linux/kasan.h
> > +++ b/include/linux/kasan.h
> > @@ -175,12 +175,14 @@ static inline size_t kasan_metadata_size(struct kmem_cache *cache) { return 0; }
> > void kasan_cache_shrink(struct kmem_cache *cache);
> > void kasan_cache_shutdown(struct kmem_cache *cache);
> > void kasan_record_aux_stack(void *ptr);
> > +void kasan_record_tmr_stack(void *ptr);
> >
> > #else /* CONFIG_KASAN_GENERIC */
> >
> > static inline void kasan_cache_shrink(struct kmem_cache *cache) {}
> > static inline void kasan_cache_shutdown(struct kmem_cache *cache) {}
> > static inline void kasan_record_aux_stack(void *ptr) {}
> > +static inline void kasan_record_tmr_stack(void *ptr) {}
>
> It appears that the 'aux' stack is currently only used for call_rcu
> stacks, but this interface does not inherently tie it to call_rcu. The
> only thing tying it to call_rcu is the fact that the report calls out
> call_rcu.
>
> > /**
> > diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
> > index 4b3cbad7431b..f35dcec990ab 100644
> > --- a/mm/kasan/generic.c
> > +++ b/mm/kasan/generic.c
> > @@ -347,6 +347,27 @@ void kasan_record_aux_stack(void *addr)
> > alloc_info->aux_stack[0] = kasan_save_stack(GFP_NOWAIT);
> > }
> >
> > +void kasan_record_tmr_stack(void *addr)
> > +{
> > + struct page *page = kasan_addr_to_page(addr);
> > + struct kmem_cache *cache;
> > + struct kasan_alloc_meta *alloc_info;
> > + void *object;
> > +
> > + if (!(page && PageSlab(page)))
> > + return;
> > +
> > + cache = page->slab_cache;
> > + object = nearest_obj(cache, page, addr);
> > + alloc_info = get_alloc_info(cache, object);
> > +
> > + /*
> > + * record the last two timer stacks.
> > + */
> > + alloc_info->tmr_stack[1] = alloc_info->tmr_stack[0];
> > + alloc_info->tmr_stack[0] = kasan_save_stack(GFP_NOWAIT);
> > +}
>
> The solution here is, unfortunately, poorly designed. This is a
> copy-paste of the kasan_record_aux_stack() function.
>

kasan_record_aux_stack() will be re-used for call_rcu, timer, and
workqueue.

> > void kasan_set_free_info(struct kmem_cache *cache,
> > void *object, u8 tag)
> > {
> > diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h
> > index ef655a1c6e15..c50827f388a3 100644
> > --- a/mm/kasan/kasan.h
> > +++ b/mm/kasan/kasan.h
> > @@ -108,10 +108,12 @@ struct kasan_alloc_meta {
> > struct kasan_track alloc_track;
> > #ifdef CONFIG_KASAN_GENERIC
> > /*
> > - * call_rcu() call stack is stored into struct kasan_alloc_meta.
> > + * call_rcu() call stack and timer queueing stack are stored
> > + * into struct kasan_alloc_meta.
> > * The free stack is stored into struct kasan_free_meta.
> > */
> > depot_stack_handle_t aux_stack[2];
> > + depot_stack_handle_t tmr_stack[2];
> > #else
> > struct kasan_track free_track[KASAN_NR_FREE_STACKS];
> > #endif
> > diff --git a/mm/kasan/report.c b/mm/kasan/report.c
> > index fed3c8fdfd25..6fa3bfee381f 100644
> > --- a/mm/kasan/report.c
> > +++ b/mm/kasan/report.c
> > @@ -191,6 +191,17 @@ static void describe_object(struct kmem_cache *cache, void *object,
> > print_stack(alloc_info->aux_stack[1]);
> > pr_err("\n");
> > }
> > +
> > + if (alloc_info->tmr_stack[0]) {
> > + pr_err("Last timer stack:\n");
> > + print_stack(alloc_info->tmr_stack[0]);
> > + pr_err("\n");
> > + }
> > + if (alloc_info->tmr_stack[1]) {
> > + pr_err("Second to last timer stack:\n");
> > + print_stack(alloc_info->tmr_stack[1]);
> > + pr_err("\n");
> > + }
>
> Why can't we just use the aux stack for everything, and simply change
> the message printed in the report. After all, the stack trace will
> include all the information to tell if it's call_rcu, timer, or
> workqueue.
>

This is a good suggestion, next patch will do it.

> The reporting code would simply have to be changed to say something
> like "Last potentially related work creation:" -- because what the
> "aux" thing is trying to abstract are stack traces to work creations
> that took an address that is closely related. Whether or not you want
> to call it "work" is up to you, but that's the most generic term I
> could think of right now (any better terms?).
>

Work is good.

> Another argument for this consolidation is that it's highly unlikely
> that aux_stack[a] && tmr_stack[b] && wq_stack[c], and you need to
> print all the stacks. If you are worried we need more aux stacks, just
> make the array size 3+ (but I think it's not necessary).
>

We hope that next patch keep it as it is aux_stack[2], it may record
call_rcu, timer, or workqueue. Because struct kasan_alloc_meta is 16
bytes, it will have the minimal redzone size and good alignment, lets
get better memory consumption.


Thanks for your good suggestion.

Walter


> Thanks,
> -- Marco