Re: [PATCH] smp: Do not warn if smp_call_function_single() is doing a self call.

From: Thomas Gleixner
Date: Tue Apr 16 2019 - 16:13:13 EST


On Tue, 16 Apr 2019, Vitaly Kuznetsov wrote:

> Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:
>
> > On Mon, Apr 15, 2019 at 11:39:57PM +0000, Dexuan Cui wrote:
> >> > From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> >> > Sent: Monday, April 15, 2019 5:21 AM
> >> > To: Dexuan Cui <decui@xxxxxxxxxxxxx>
> >> >
> >> > On Fri, Apr 12, 2019 at 11:53:57PM +0000, Dexuan Cui wrote:
> >> > > If smp_call_function_single() is calling the function for itself, it's safe
> >> > > to run with irqs_disabled() == true.
> >> > >
> >> > > I hit the warning because I'm in the below path in the .suspend callback of
> >> > > a "syscore_ops" to support hibernation for a VM running on Hyper-V:
> >> > >
> >> > > hv_synic_cleanup() ->
> >> > > clockevents_unbind_device() ->
> >> > > clockevents_unbind() ->
> >> > > smp_call_function_single().
> >> > >
> >> > > When the .suspend callback runs, only CPU0 is online and irqs_disabled() is
> >> > > true.
> >> >
> >> > Pray tell, how well do you think mutex_lock() works with interrupts
> >> > disabled?
> >>
> >> Good point. I realized generally speaking this patch makes no sense, so let me
> >> try the solution proposed by Vitaly, i.e. fix clockevents_unbind() instead.
> >
> > That's still not the problem. You're calling clockevents_unbind_device()
> > with IRQs disabled, that's not correct. It doesn't matter what
> > clockevents_unbind() does thereafter.
> >
>
> True. And before we start digging deeper into this, let's step back: why
> do we need to do clockevents_unbind_device() on hibernation? Can we just
> disable the device and re-enable it on resume?

Yes. That's the right thing to do. The simple solution is to implement the
suspend/resume callbacks on the clock events device and be done with it.
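
A minimal sketch of the shape this could take, written as a userspace model
rather than kernel code. Everything prefixed with mock_ is hypothetical and
exists only for illustration; the one thing taken from the kernel is that
struct clock_event_device carries optional suspend and resume function
pointers which the core invokes around suspend/hibernation. What a real
driver would have to do to quiesce and re-arm its timer hardware is not
shown here.

```c
/*
 * Userspace model only. struct mock_clock_event_device is a hypothetical
 * stand-in for the kernel's struct clock_event_device, reduced to the
 * fields needed to show the suspend/resume callback pattern.
 */
#include <stdbool.h>
#include <stddef.h>

struct mock_clock_event_device {
	const char *name;
	bool enabled;	/* models "the timer hardware is armed" */
	void (*suspend)(struct mock_clock_event_device *dev);
	void (*resume)(struct mock_clock_event_device *dev);
};

/* Hypothetical driver callback: quiesce the device, keep it registered. */
static void mock_timer_suspend(struct mock_clock_event_device *dev)
{
	dev->enabled = false;
}

/* Hypothetical driver callback: bring the same device back on resume. */
static void mock_timer_resume(struct mock_clock_event_device *dev)
{
	dev->enabled = true;
}

/* Models the core walking its devices before the system goes down. */
static void mock_clockevents_suspend(struct mock_clock_event_device **devs,
				     size_t n)
{
	for (size_t i = n; i-- > 0; )
		if (devs[i]->suspend)
			devs[i]->suspend(devs[i]);
}

/* Models the core walking its devices after the system comes back. */
static void mock_clockevents_resume(struct mock_clock_event_device **devs,
				    size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (devs[i]->resume)
			devs[i]->resume(devs[i]);
}
```

The point of the pattern is that the device stays bound the whole time, so
nothing in the suspend path has to unbind it, and therefore nothing has to
take a mutex or issue a cross-CPU call with interrupts disabled.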

> Actually, all usages of clockevents_unbind_device() in the kernel are
> limited to Hyper-V, and with Michael's patches moving this out of the VMBus
> driver I think it can go away completely.

Correct. There was a driver which required that, but that's gone by now and
of course nobody noticed that it was the last user. The reason why this
exists was to allow switching out an active clocksource similar to the
sysfs unbind file but without user space interaction.

Thanks,

tglx