Re: [PATCH] smp: Allow smp_call_function_single_async() to insert locked csd

From: Peter Zijlstra
Date: Tue Dec 17 2019 - 15:24:04 EST


On Tue, Dec 17, 2019 at 10:31:28AM -0500, Peter Xu wrote:
> On Tue, Dec 17, 2019 at 10:51:56AM +0100, Peter Zijlstra wrote:
> > On Mon, Dec 16, 2019 at 03:58:33PM -0500, Peter Xu wrote:
> > > On Mon, Dec 16, 2019 at 09:37:05PM +0100, Peter Zijlstra wrote:
> > > > On Wed, Dec 11, 2019 at 11:29:25AM -0500, Peter Xu wrote:
> >
> > > > > (3) Others:
> > > > >
> > > > > *** arch/mips/kernel/process.c:
> > > > > raise_backtrace[713] smp_call_function_single_async(cpu, csd);
> > > >
> > > > per-cpu csd data, seems perfectly fine usage.
> > >
> > > I'm not sure whether I get the point, I just feel like it could still
> > > trigger as long as we do it super fast, before IPI handled,
> > > disregarding whether it's per-cpu csd or not.
> >
> > No, I wasn't paying attention last night. I think this one might
> > belong in 1). It does the state check using that bitmap.
>
> Indeed. Though I'm not sure this one should be changed too, since I'm
> not sure whether that pr_warn is really intended:
>
>         if (cpumask_test_and_set_cpu(cpu, &backtrace_csd_busy)) {
>                 pr_warn("Unable to send backtrace IPI to CPU%u - perhaps it hung?\n",
>                         cpu);
>                 continue;
>         }
>
> I mean, that should depend on whether it can really hang somehow (or it's
> the same issue as what we're trying to fix)... If it won't hang, then
> it should be safe I think, and this pr_warn could be useless after all.

Yeah, leave it.
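
For reference, the guard quoted above boils down to a test-and-set on a
per-CPU busy bit before the csd is reused. Below is a minimal userspace
sketch of that idea, assuming C11 atomics stand in for the cpumask ops and
assuming the IPI handler clears the bit once it is done. The names
csd_busy_mask, csd_try_claim() and csd_release() are made up for
illustration; they are not kernel API.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the backtrace_csd_busy cpumask: one bit per CPU.
 * Only valid for cpu numbers smaller than the bit width of unsigned long. */
static atomic_ulong csd_busy_mask;

/*
 * Models the cpumask_test_and_set_cpu() check, with the sense inverted:
 * returns true when the bit was previously clear, i.e. this caller now
 * owns the csd slot for 'cpu' and may queue the call. Returns false when
 * an earlier request for that CPU is still in flight.
 */
static bool csd_try_claim(unsigned int cpu)
{
    unsigned long bit = 1UL << cpu;
    unsigned long old = atomic_fetch_or(&csd_busy_mask, bit);

    return !(old & bit);
}

/*
 * Models what the handler is assumed to do when it finishes: clear the
 * busy bit so the csd for 'cpu' can be reused by the next request.
 */
static void csd_release(unsigned int cpu)
{
    atomic_fetch_and(&csd_busy_mask, ~(1UL << cpu));
}
```

A second csd_try_claim() for the same CPU fails until csd_release() has
run, which is the case where the mips code prints the warning and skips
that CPU instead of queueing the csd twice.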

> > I suspect it is to be nice for virt. Both CPUID and MSR accesses can
> > trap, but now I'm confused, because it is mostly WRMSR that traps.
> >
> > Anyway, see the commit here: 07cde313b2d2 ("x86/msr: Allow rdmsr_safe_on_cpu() to schedule")
>
> Yes that makes sense. Thanks for the pointer.
>
> However, then my next confusion is why they can't provide a common
> solution to the smp code again... I feel like it could be even easier
> (please see below). I'm not very familiar with smp code yet, but if
> it works it should benefit all callers imho.

Ah, so going to sleep on wait_for_completion() is _much_ more expensive
than a short spin. So it all depends on the expected behaviour of the
IPI I suppose.

In general we expect these IPIs to be 'quick'.

Also, as is, you're allowed to use the smp_call_function*() family with
preemption disabled, which pretty much precludes using
wait_for_completion().