Re: smp.c && barriers (Was: [PATCH 1/4] generic-smp: remove single ipi fallback for smp_call_function_many())

From: Nick Piggin
Date: Wed Feb 18 2009 - 09:00:00 EST


On Tue, Feb 17, 2009 at 06:21:42PM -0800, Suresh B wrote:
> On Tue, 2009-02-17 at 03:26 -0800, Nick Piggin wrote:
> > --
> > Simplify the barriers in generic remote function call interrupt code.
> >
> > Firstly, just unconditionally take the lock and check the list in the
> > generic_call_function_single_interrupt IPI handler. As we've just taken
> > an IPI here, the chances are fairly high that there will be work on the
> > list for us, so do the locking unconditionally. This removes the tricky
> > lockless list_empty check and dubious barriers. The change looks bigger
> > than it is because it is just removing an outer loop.
> >
> > Secondly, clarify architecture specific IPI locking rules. Generic code
> > has no tools to impose any sane ordering on IPIs if they go outside
> > normal cache coherency, ergo the arch code must make them appear to
> > obey cache coherency as a "memory operation" to initiate an IPI, and
> > a "memory operation" to receive one. This way at least they can be
> > reasoned about in generic code, and smp_mb used to provide ordering.
> >
> > The combination of these two changes means that explicit barriers can
> > be taken out of queue handling for the single case -- shared data is
> > explicitly locked, and ipi ordering must conform to that, so no
> > barriers needed. An extra barrier is needed in the many handler, so
> > as to ensure we load the list element after the IPI is received.
> >
> > Does any architecture actually need barriers? For the initiator I
> > could see it, but for the handler I would be surprised. The other
> > thing we could do for simplicity is just to require a full barrier
> > before generating an IPI, and after receiving one. We can't just do
> > that in generic code without auditing architectures. There have
> > been subtle hangs here on some archs in the past.
>
> x2apic register reads/writes don't have serializing semantics, as
> opposed to uncached xapic accesses, which are inherently serializing.
>
> With this patch, we need to fix the corresponding x2apic IPI operations.
> I will take a look at it.

You're saying the problem is in generic_exec_single because I've
removed the smp_mb that inadvertently also serialises memory with
the x2apic on x86?

Indeed that could cause problems on some architectures, which I
had hoped to avoid. So the patch should probably first add the
smp_mb() to the arch_send_call_function_xxx arch code, unless it
is immediately obvious, or confirmed by the arch maintainer, that
such a barrier is not required.
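
To make the ordering requirement concrete, here is a rough userspace
sketch of what I mean. It is not the kernel code: every fake_* name is
made up for illustration, smp_mb() is modelled with a C11 seq_cst
fence, and the non-serialising IPI trigger (such as the x2apic write
Suresh describes) is modelled as a relaxed atomic store. The barrier
sits in the "arch" send hook, so the queued data is visible before the
IPI can be observed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's smp_mb(): a full memory barrier. */
#define smp_mb() atomic_thread_fence(memory_order_seq_cst)

/* Hypothetical model of one CPU's call-function state. */
struct fake_cpu {
	int queued_arg;           /* shared data published by the initiator */
	atomic_bool ipi_pending;  /* models the pending interrupt */
};

/*
 * Hypothetical raw IPI trigger. The relaxed store models a register
 * write with no serialising semantics: on its own it does not order
 * the earlier store to queued_arg.
 */
static void fake_raw_send_ipi(struct fake_cpu *cpu)
{
	atomic_store_explicit(&cpu->ipi_pending, true, memory_order_relaxed);
}

/*
 * Hypothetical arch hook: the arch code, not the generic code, makes
 * the IPI behave like an ordered memory operation by issuing a full
 * barrier before the trigger.
 */
static void fake_arch_send_call_function_single_ipi(struct fake_cpu *cpu)
{
	smp_mb();
	fake_raw_send_ipi(cpu);
}

/* Generic side: publish the data, then ask the arch to send the IPI. */
static void fake_exec_single(struct fake_cpu *cpu, int arg)
{
	assert(cpu != NULL);
	cpu->queued_arg = arg;
	fake_arch_send_call_function_single_ipi(cpu);
}

/*
 * Handler side: consume the pending IPI, then read the data. The fence
 * models the requirement that receiving an IPI is also an ordered
 * memory operation. Returns true if an IPI was pending.
 */
static bool fake_handle_ipi(struct fake_cpu *cpu, int *out)
{
	if (!atomic_exchange_explicit(&cpu->ipi_pending, false,
				      memory_order_relaxed))
		return false;
	smp_mb();
	*out = cpu->queued_arg;
	return true;
}
```

If the smp_mb() in fake_arch_send_call_function_single_ipi() were
dropped, nothing in this model would order the store to queued_arg
against the trigger, which is the situation I am worried about when
the barrier is taken out of generic_exec_single.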
