On Fri, Jan 24, 2025 at 03:41:45PM -0500, Waiman Long wrote:
On 1/24/25 3:37 PM, Paul E. McKenney wrote:
On Fri, Jan 24, 2025 at 12:25:19PM -0800, Paul E. McKenney wrote:
On Fri, Jan 24, 2025 at 01:59:23PM -0500, Waiman Long wrote:

The following bug report happened in a PREEMPT_RT kernel.
[ 30.957705] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
[ 30.957711] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2012, name: kwatchdog
[ 30.962673] preempt_count: 1, expected: 0
[ 30.962676] RCU nest depth: 0, expected: 0
[ 30.962680] 3 locks held by kwatchdog/2012:
[ 30.962684] #0: ffffffff8af2da60 (clocksource_mutex){+.+.}-{3:3}, at: clocksource_watchdog_kthread+0x13/0x50
[ 30.967703] #1: ffffffff8aa8d4d0 (cpu_hotplug_lock){++++}-{0:0}, at: clocksource_verify_percpu.part.0+0x5c/0x330
[ 30.972774] #2: ffff9fe02f5f33e0 ((batched_entropy_u32.lock)){+.+.}-{2:2}, at: get_random_u32+0x4f/0x110
[ 30.977827] Preemption disabled at:
[ 30.977830] [<ffffffff88c1fe56>] clocksource_verify_percpu.part.0+0x66/0x330
[ 30.982837] CPU: 33 PID: 2012 Comm: kwatchdog Not tainted 5.14.0-503.23.1.el9_5.x86_64+rt-debug #1
[ 30.982843] Hardware name: HPE ProLiant DL385 Gen10 Plus/ProLiant DL385 Gen10 Plus, BIOS A42 04/29/2021
[ 30.982846] Call Trace:
[ 30.982850] <TASK>
[ 30.983821] dump_stack_lvl+0x57/0x81
[ 30.983821] __might_resched.cold+0xf4/0x12f
[ 30.983824] rt_spin_lock+0x4c/0x100
[ 30.988833] get_random_u32+0x4f/0x110
[ 30.988833] clocksource_verify_choose_cpus+0xab/0x1a0
[ 30.988833] clocksource_verify_percpu.part.0+0x6b/0x330
[ 30.993894] __clocksource_watchdog_kthread+0x193/0x1a0
[ 30.993898] clocksource_watchdog_kthread+0x18/0x50
[ 30.993898] kthread+0x114/0x140
[ 30.993898] ret_from_fork+0x2c/0x50
[ 31.002864] </TASK>
This happens because get_random_u32() is called from
clocksource_verify_choose_cpus() with preemption disabled. If
crng_ready() is true by the time get_random_u32() is called, the
batched_entropy_32 local lock will be acquired. On a PREEMPT_RT
kernel, that lock is an rtmutex, which cannot be acquired with
preemption disabled.

To avoid this problem, get_random_u32() must not be called with
preemption disabled. However, smp_processor_id() does have to be
called with preemption disabled, and the current CPU has to be
excluded from the cpus_chosen mask of CPUs to be tested.

Extract the current-CPU removal code out of
clocksource_verify_choose_cpus() and defer the preempt_disable()
call until after clocksource_verify_choose_cpus() returns, just
before the current CPU is removed. Also use raw_smp_processor_id()
in clocksource_verify_choose_cpus().
Fixes: 7560c02bdffb ("clocksource: Check per-CPU clock synchronization when marked unstable")
Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
Paul E. McKenney replied:

Good catch!

But we don't need cryptographically secure random numbers (or blistering
speed) here. Would substituting something like torture_random() % nr_cpu_ids
for get_random_u32_below() work?

I suppose I should add my concern... If we don't have preemption disabled
across this code, we cannot reliably avoid attempting to IPI ourselves.

Waiman Long replied:

Does the CPU choosing process itself need to have preemption disabled? I
thought it was because of the need to use smp_processor_id() and have the
current CPU excluded. Preemption is still disabled after that.

Paul E. McKenney replied:

If the function is migrated from one CPU to another, the check against
raw_smp_processor_id() doesn't mean much. At that point, you might as
well just rely on the calling function clearing it.

And the extra preemption disabling happens only in an error condition
where extra debugging is enabled. Plus clocksource_verify_choose_cpus()
should be pretty quick. So do we really care about the additional
disabling of preemption?