Re: sched/isolation: tick_take_do_timer_from_boot() calls smp_call_function_single() with irqs disabled

From: Oleg Nesterov
Date: Tue May 28 2024 - 08:20:57 EST


On 05/28, Nicholas Piggin wrote:
>
> >
> > So, Frederic, Nicholas, any objections to the trivial change below?
>
> Since Thomas says it's alright, then no. I guess I added it because I
> was not certain about taking the tick_do_timer_cpu while the boot CPU
> could be running a timer interrupt.

I thought about it too, but didn't see anything wrong...

Suppose that WRITE_ONCE(tick_do_timer_cpu, cpu) happens right after
tick_periodic() on the boot CPU sees READ_ONCE(tick_do_timer_cpu) == cpu.
Does this really differ from the case when tick_take_do_timer_from_boot()
waits for the boot CPU to return from timer_interrupt()?
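
To make the comparison concrete, here is a minimal userspace sketch of the
two orderings. It is not kernel code: tick_model(), run_scenario() and the
plain int/unsigned long variables are hypothetical stand-ins for
tick_periodic(), READ_ONCE()/WRITE_ONCE() and jiffies, and it assumes the
new CPU's first tick arrives in the period after the handover.

```c
#include <stdio.h>

enum { BOOT_CPU = 0, NEW_CPU = 1, NO_HANDOVER = -1 };

/* Hypothetical stand-ins for the kernel's tick_do_timer_cpu and jiffies. */
static int tick_do_timer_cpu;
static unsigned long jiffies;

/*
 * Models one periodic tick on @cpu. If @handover_to is not NO_HANDOVER,
 * another CPU's store to tick_do_timer_cpu lands right after this CPU's
 * ownership check but before its jiffies update.
 */
static void tick_model(int cpu, int handover_to)
{
	int owner = (tick_do_timer_cpu == cpu);	/* models the READ_ONCE() check */

	if (handover_to != NO_HANDOVER)
		tick_do_timer_cpu = handover_to;	/* models the racing WRITE_ONCE() */

	if (owner)
		jiffies++;			/* models do_timer(1) */
}

/*
 * Runs four tick periods with the handover in period 2.
 * racy != 0: the store races with the boot CPU's check.
 * racy == 0: the store happens after the boot CPU's tick has returned.
 */
static unsigned long run_scenario(int racy)
{
	int period;

	tick_do_timer_cpu = BOOT_CPU;
	jiffies = 0;

	for (period = 1; period <= 4; period++) {
		if (period == 2 && racy) {
			tick_model(BOOT_CPU, NEW_CPU);
		} else {
			tick_model(BOOT_CPU, NO_HANDOVER);
			if (period == 2)
				tick_do_timer_cpu = NEW_CPU;
		}

		/* Assumption: the new CPU starts ticking in the next period. */
		if (period > 2)
			tick_model(NEW_CPU, NO_HANDOVER);
	}

	return jiffies;
}
```

In this model run_scenario(0) and run_scenario(1) return the same value:
either way the boot CPU finishes the tick it already claimed, and the new
CPU owns every later one.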

> I would take some of his comment to explain the race is harmless and
> put it in that if block.

Yes, yes, sure. See the patch I'll send in a minute.

> Out of curiosity, you are getting this going on x86?

Yes, and I didn't check other architectures.

> Any particular use-case?

I have no idea. I noticed this problem when I was working on 5097cbcb38e6
("sched/isolation: Prevent boot crash when the boot CPU is nohz_full"), see
https://lore.kernel.org/all/20240411165936.GA20901@xxxxxxxxxx/

Perhaps Chris who reported that problem can add more details.

Oleg.