Re: [RESEND PATCH v2] clocksource/arm_arch_timer: fix a lockdep warning

From: Marc Zyngier
Date: Mon Dec 10 2018 - 09:48:11 EST


On 10/12/2018 13:52, Qian Cai wrote:
> Booting this Huawei TaiShan 2280 arm64 server generated this lockdep
> warning.
>
> [ 0.000000] lockdep_assert_cpus_held+0x50/0x60
> [ 0.000000] static_key_enable_cpuslocked+0x30/0xe8
> [ 0.000000] arch_timer_check_ool_workaround+0x128/0x2d0
> [ 0.000000] arch_timer_acpi_init+0x274/0x6ac
> [ 0.000000] acpi_table_parse+0x1ac/0x218
> [ 0.000000] __acpi_probe_device_table+0x164/0x1ec
> [ 0.000000] timer_probe+0x1bc/0x254
> [ 0.000000] time_init+0x44/0x98
> [ 0.000000] start_kernel+0x4ec/0x7d4
>
> This is due to the commit cb538267ea1e ("jump_label/lockdep: Assert we hold
> the hotplug lock for _cpuslocked() operations").
>
> Since it is applying a global workaround to all CPUs here, it did not hold
> any CPU locks in this path.
>
> arch_timer_acpi_init
> arch_timer_check_ool_workaround(ate_match_acpi_oem_info, table)
> arch_timer_enable_workaround(wa, local = false)
> for_each_possible_cpu()
> per_cpu()
>
> There is also another path did not have any CPU lock.
>
> time_init
> clocksource_probe
> arch_timer_of_init
> arch_timer_check_ool_workaround(ate_match_dt, np)
> arch_timer_enable_workaround(wa, local = false)
>
> When hot-adding a CPU, it will go with a slightly different route.
>
> arch_timer_starting_cpu
> __arch_timer_setup
> arch_timer_check_ool_workaround(ate_match_local_cap_id, NULL)
> arch_timer_enable_workaround(wa, local = true)
> __this_cpu_write()
>
> Hence, deal with them differently.
>
> Fixes: 450f9689f294 (clocksource/arm_arch_timer: Use
> static_branch_enable_cpuslocked())
> Signed-off-by: Qian Cai <cai@xxxxxx>
> ---
>
> v2: fix the root cause instead of a workaround.
>
> drivers/clocksource/arm_arch_timer.c | 15 +++++++++------
> 1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> index 9a7d4dc00b6e..81dca7d31d13 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -492,17 +492,20 @@ void arch_timer_enable_workaround(const struct arch_timer_erratum_workaround *wa
>
> if (local) {
> __this_cpu_write(timer_unstable_counter_workaround, wa);
> +
> + /*
> + * Use the locked version, as we're called from the CPU
> + * hotplug framework. Otherwise, we end-up in
> + * deadlock-land.
> + */
> + static_branch_enable_cpuslocked(&arch_timer_read_ool_enabled);

I have the ugly feeling that it breaks the (equally ugly) big-little
stuff where the boot CPU is affected. In this context, you'd end-up
without the lock being taken and you'll get the same splat.

We could start testing the CPU number, but Peter's approach seem much
more palatable to me.

Thanks,

M.
--
Jazz is not dead. It just smells funny...