Re: [PATCH] timer: fix a debugobjects warning in del_timer()
From: Qian Cai
Date: Sat Dec 07 2019 - 19:25:21 EST
> On Dec 7, 2019, at 2:58 AM, Qian Cai <cai@xxxxxx> wrote:
>
> Since the commit e33026831bdb ("x86/intel_rdt/mbm: Handle counter
> overflow"), it will generate a debugobjects warning while offlining
> CPUs.
>
> ODEBUG: assert_init not available (active state 0) object type:
> timer_list hint: 0x0
> WARNING: CPU: 143 PID: 789 at lib/debugobjects.c:484
> debug_print_object+0xfe/0x140
> Hardware name: HP Synergy 680 Gen9/Synergy 680 Gen9 Compute Module, BIOS
> I40 05/23/2018
> RIP: 0010:debug_print_object+0xfe/0x140
> Call Trace:
> debug_object_assert_init+0x1f5/0x240
> del_timer+0x6f/0xf0
> try_to_grab_pending+0x42/0x3c0
> cancel_delayed_work+0x7d/0x150
> resctrl_offline_cpu+0x3c0/0x520
> cpuhp_invoke_callback+0x197/0x1120
> cpuhp_thread_fun+0x252/0x2f0
> smpboot_thread_fn+0x255/0x440
> kthread+0x1e6/0x210
> ret_from_fork+0x3a/0x50
>
> This is because in domain_remove_cpu() when "cpu == d->mbm_work_cpu", it
> calls cancel_delayed_work(&d->mbm_over) to deactivate the timer, and
> then mbm_setup_overflow_handler() calls schedule_delayed_work_on() with
> 0 delay which does not activate the timer in __queue_delayed_work().
>
> Later, when the last CPU in the same L3 cache goes offline, it calls
> cancel_delayed_work(&d->mbm_over) again in domain_remove_cpu() and
> triggers the warning because the timer is still inactive.
>
> Since del_timer() could be called on both active and inactive timers,
> debug_assert_init() should be called only when there is an active timer.
>
> Signed-off-by: Qian Cai <cai@xxxxxx>
Self-NACK this and I'll post a more correct patch.