Re: [PATCH] smp: Allow smp_call_function_single_async() to insert locked csd

From: Peter Xu
Date: Mon Dec 16 2019 - 15:58:41 EST


On Mon, Dec 16, 2019 at 09:37:05PM +0100, Peter Zijlstra wrote:
> On Wed, Dec 11, 2019 at 11:29:25AM -0500, Peter Xu wrote:
> > This is also true.
> >
> > Here's the statistics I mentioned:
> >
> > =================================================
> >
> > (1) Implemented the same counter mechanism on the caller's:
> >
> > *** arch/mips/kernel/smp.c:
> > tick_broadcast[713] smp_call_function_single_async(cpu, csd);
> > *** drivers/cpuidle/coupled.c:
> > cpuidle_coupled_poke[336] smp_call_function_single_async(cpu, csd);
> > *** kernel/sched/core.c:
> > hrtick_start[298] smp_call_function_single_async(cpu_of(rq), &rq->hrtick_csd);
> >
> > (2) Cleared the csd flags before calls:
> >
> > *** arch/s390/pci/pci_irq.c:
> > zpci_handle_fallback_irq[185] smp_call_function_single_async(cpu, &cpu_data->csd);
> > *** block/blk-mq.c:
> > __blk_mq_complete_request[622] smp_call_function_single_async(ctx->cpu, &rq->csd);
> > *** block/blk-softirq.c:
> > raise_blk_irq[70] smp_call_function_single_async(cpu, data);
> > *** drivers/net/ethernet/cavium/liquidio/lio_core.c:
> > liquidio_napi_drv_callback[735] smp_call_function_single_async(droq->cpu_id, csd);
> >
> > (3) Others:
> >
> > *** arch/mips/kernel/process.c:
> > raise_backtrace[713] smp_call_function_single_async(cpu, csd);
>
> per-cpu csd data, seems perfectly fine usage.

I'm not sure I get the point.  I think the WARN_ON could still trigger
if a second call comes in quickly enough, before the IPI for the first
one has been handled, regardless of whether the csd is per-cpu or not.
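
To make the scenario concrete, here is a toy userspace model (not kernel
code).  The toy_* names are made up for illustration, the "locked" field
only stands in for CSD_FLAG_LOCK, and I'm assuming the patched helper
rejects a still-locked csd with -EBUSY instead of warning:

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for call_single_data_t; NOT the kernel definition. */
struct toy_csd {
	bool locked;		/* models CSD_FLAG_LOCK */
	void (*func)(void *);	/* callback to run on the target CPU */
	void *info;		/* argument passed to the callback */
};

/*
 * Models the patched smp_call_function_single_async() (assumption:
 * a csd that is still in flight is refused with -EBUSY).
 */
static int toy_call_single_async(struct toy_csd *csd)
{
	if (csd->locked)
		return -EBUSY;	/* previous request not handled yet */
	csd->locked = true;	/* request queued, "IPI" sent */
	return 0;
}

/* Models the IPI handler on the target CPU for an async csd. */
static void toy_ipi_handler(struct toy_csd *csd)
{
	csd->locked = false;	/* unlock first, so the csd can be reused */
	csd->func(csd->info);
}

/* Hypothetical callback: counts how many times it ran. */
static void toy_count_hit(void *info)
{
	(*(int *)info)++;
}
```

Calling toy_call_single_async() twice on the same csd before
toy_ipi_handler() runs makes the second call fail, and nothing in the
model depends on where the csd is stored.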

>
> > *** arch/x86/kernel/cpuid.c:
> > cpuid_read[85] err = smp_call_function_single_async(cpu, &csd);
> > *** arch/x86/lib/msr-smp.c:
> > rdmsr_safe_on_cpu[182] err = smp_call_function_single_async(cpu, &csd);
>
> These two have csd on stack and wait with a completion. seems fine.

Yes, this is true.  Then I'm confused why they don't use the sync()
helpers.

>
> > *** include/linux/smp.h:
> > bool[60] int smp_call_function_single_async(int cpu, call_single_data_t *csd);
>
> this is the declaration, your grep went funny
>
> > *** kernel/debug/debug_core.c:
> > kgdb_roundup_cpus[272] ret = smp_call_function_single_async(cpu, csd);
> > *** net/core/dev.c:
> > net_rps_send_ipi[5818] smp_call_function_single_async(remsd->cpu, &remsd->csd);
>
> Both percpu again.
>
> >
> > =================================================
> >
> > For (1): These probably justify more on that we might want a patch
> > like this to avoid reimplementing it everywhere.
>
> I can't quite parse that, but if you're saying we should fix the
> callers, then I agree.
>
> > For (2): If I read it right, smp_call_function_single_async() is the
> > only place where we take a call_single_data_t structure
> > rather than the (smp_call_func_t, void *) tuple.
>
> That's on purpose; by supplying csd we allow explicit concurrency. If
> you do as proposed here:
>
> > I could
> > miss something important, but otherwise I think it would be
> > good to use the tuple for smp_call_function_single_async() as
> > well, then we move call_single_data_t out of global header
> > but move into smp.c to avoid callers from touching it (which
> > could be error-prone). In other words, IMHO it would be good
> > to have all these callers fixed.
>
> Then you could only ever have 1 of them in flight at the same time.
> Which would break things.

Sorry, I think you are right.
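
Just to check my understanding, a toy userspace sketch of the difference
(not kernel code; the demo_* names are hypothetical, and the tuple
variant uses one hidden csd for brevity where a real design would
presumably keep one per target CPU):

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for call_single_data_t; NOT the kernel definition. */
struct demo_csd {
	bool locked;		/* models CSD_FLAG_LOCK */
	void (*func)(void *);
	void *info;
};

/* Caller-owned csd: each caller brings its own slot. */
static int demo_async_csd(struct demo_csd *csd)
{
	if (csd->locked)
		return -EBUSY;
	csd->locked = true;	/* this slot is now in flight */
	return 0;
}

/* Hypothetical tuple variant: the csd is hidden inside the helper. */
static struct demo_csd demo_hidden_csd;

static int demo_async_tuple(void (*func)(void *), void *info)
{
	if (demo_hidden_csd.locked)
		return -EBUSY;	/* the only slot is already in flight */
	demo_hidden_csd.func = func;
	demo_hidden_csd.info = info;
	demo_hidden_csd.locked = true;
	return 0;
}

/* Hypothetical callback that does nothing. */
static void demo_noop(void *info)
{
	(void)info;
}
```

With two separate demo_csd objects both demo_async_csd() calls succeed
before any handler runs, while the second demo_async_tuple() call is
refused because the hidden slot is still busy.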

>
> > For (3): I didn't dig, but I think some of them (or future users)
> > could still suffer from the same issue on retriggering the
> > WARN_ON...
>
> They all seem fine.
>
> So I'm thinking your patch is good, but please also fix all 1).

Sure. Thanks,

--
Peter Xu