Re: [PATCH v3 2/2] clocksource: Use get_random_bytes() in clocksource_verify_choose_cpus()

From: Waiman Long
Date: Thu Jan 30 2025 - 11:28:27 EST



On 1/30/25 2:37 AM, Sebastian Andrzej Siewior wrote:
On 2025-01-29 17:40:01 [-0500], Waiman Long wrote:

Instead of the backtrace

The following bug report happened in a PREEMPT_RT kernel.

[ 30.957705] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
[ 30.957711] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2012, name: kwatchdog
[ 30.962673] preempt_count: 1, expected: 0
[ 30.962676] RCU nest depth: 0, expected: 0
[ 30.962680] 3 locks held by kwatchdog/2012:
[ 30.962684] #0: ffffffff8af2da60 (clocksource_mutex){+.+.}-{3:3}, at: clocksource_watchdog_kthread+0x13/0x50
[ 30.967703] #1: ffffffff8aa8d4d0 (cpu_hotplug_lock){++++}-{0:0}, at: clocksource_verify_percpu.part.0+0x5c/0x330
[ 30.972774] #2: ffff9fe02f5f33e0 ((batched_entropy_u32.lock)){+.+.}-{2:2}, at: get_random_u32+0x4f/0x110
[ 30.977827] Preemption disabled at:
[ 30.977830] [<ffffffff88c1fe56>] clocksource_verify_percpu.part.0+0x66/0x330
[ 30.982837] CPU: 33 PID: 2012 Comm: kwatchdog Not tainted 5.14.0-503.23.1.el9_5.x86_64+rt-debug #1
[ 30.982843] Hardware name: HPE ProLiant DL385 Gen10 Plus/ProLiant DL385 Gen10 Plus, BIOS A42 04/29/2021
[ 30.982846] Call Trace:
[ 30.982850] <TASK>
[ 30.983821] dump_stack_lvl+0x57/0x81
[ 30.983821] __might_resched.cold+0xf4/0x12f
[ 30.983824] rt_spin_lock+0x4c/0x100
[ 30.988833] get_random_u32+0x4f/0x110
[ 30.988833] clocksource_verify_choose_cpus+0xab/0x1a0
[ 30.988833] clocksource_verify_percpu.part.0+0x6b/0x330
[ 30.993894] __clocksource_watchdog_kthread+0x193/0x1a0
[ 30.993898] clocksource_watchdog_kthread+0x18/0x50
[ 30.993898] kthread+0x114/0x140
[ 30.993898] ret_from_fork+0x2c/0x50
[ 31.002864] </TASK>
which is huge and doesn't provide much unique information could please

It is due to the fact that get_random_u32() is called in
clocksource_verify_choose_cpus() with preemption disabled. The
batched_entropy_32 local lock and/or the base_crng.lock spinlock will
be acquired. In PREEMPT_RT kernel, they are rtmutexes and the above
warning will be printed if the fast path fails because of contention.
extend this?
clocksource_verify_choose_cpus() is invoked with preemption disabled, it
invokes get_random_u32() to obtain random numbers. To do so it acquires
the lock batched_entropy_32 which is a local_lock_t. This lock becomes a
sleeping lock on PREEMPT_RT and must no be acquired in atomic context.

Fix this problem by moving the clocksource_verify_choose_cpus() call
before preempt_disable() while moving the part that needs preemption to
be disabled out into a new clocksource_verify_fixup_cpus() helper that
is called after preempt_disable(). In that way, the get_random_u32()
function will now be called with preemption enabled.
Could you replace the patch below with
https://lore.kernel.org/all/20250129202909.GQNNqNoH@xxxxxxxxxxxxx/

Or is there anything that makes it not work? It looks way simpler. Just
to disable preemption during the measurement and keep the task on the
same CPU for the whole time.

Fixes: 7560c02bdffb ("clocksource: Check per-CPU clock synchronization when marked unstable")
Signed-off-by: Waiman Long <longman@xxxxxxxxxx>

I will send a v4 to incorporate your suggestion.

Cheers,
Longman