Re: [PATCH RFC] percpu: add data dependency barrier in percpu accessors and operations
From: Christoph Lameter
Date: Tue Jul 15 2014 - 12:12:31 EST
On Tue, 15 Jul 2014, Linus Torvalds wrote:
> Really, "before" and "after" have ABSOLUTELY NO MEANING unless you
> have a barrier. And you're arguing against those barriers. So you
> cannot use "before" as an argument, since in your world, no such thing
> even exists!
I mentioned that there is a barrier because the process of handing over
the offset to the other processor includes synchronization. In the slab
case this is a semaphore that is used to protect the structure and the
list of kmem_cache structures. The control struct containing the offset
must be entered somehow into something that tracks it for the future, and
thus there is synchronization by the subsystem.
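To make the handover concrete, here is a rough userspace analogy, not kernel code. The names `struct cache_ctl`, `register_cache` and `lookup_offset` are made up for illustration, and a pthread mutex stands in for the semaphore. The control struct is linked into a list under the lock, and whoever finds it takes the same lock, so the unlock/lock pair orders the earlier store of the offset.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a kmem_cache-like control structure. */
struct cache_ctl {
    size_t percpu_offset;       /* the offset that gets handed over */
    struct cache_ctl *next;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct cache_ctl *cache_list;

/* Publisher: fill in the offset, then link the struct under the lock. */
void register_cache(struct cache_ctl *c, size_t offset)
{
    c->percpu_offset = offset;
    pthread_mutex_lock(&list_lock);
    c->next = cache_list;
    cache_list = c;
    pthread_mutex_unlock(&list_lock);
}

/*
 * Consumer: take the same lock to find the newest struct.
 * Returns 1 and stores the offset if a struct is registered, else 0.
 */
int lookup_offset(size_t *offset_out)
{
    int found = 0;

    pthread_mutex_lock(&list_lock);
    if (cache_list) {
        *offset_out = cache_list->percpu_offset;
        found = 1;
    }
    pthread_mutex_unlock(&list_lock);
    return found;
}
```

This only covers a consumer that actually goes through the lock. A consumer that picks up the offset some other way gets no ordering from it.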
> There are other arguments, but they basically boil down to "no other
> CPU ever accesses the per-cpu data of *this* CPU" (wrong) or "the
> users will do their own barriers" (maybe true, maybe not). Your "value
> is only available after" argument really isn't an argument. Not
> without those barriers.
Ok so what is happening is:
1. cacheline is zeroed on per_cpu_alloc but a stale copy still exists in
the remote processor's cache.
(we could actually insert code in alloc_percpu to ensure that the remote
caches are cleaned and not proceed unless that is complete. alloc_percpu
is not performance critical).
2. cacheline is initialized with new values by the subsystem looping over
all percpu instances. Other processor still keeps the old data.
3. mutex is taken, list modifications occur, mutex is released. Remote
processor still keeps the old cacheline data.
4. Subsystem makes the percpu offset available.
5. The remote processor uses its instance of the per cpu data for the
first time, using the offset to locate its own percpu data. This
typically means it is updating the cacheline (and we hope that the
cacheline will stay in exclusive state for good, for performance reasons).
And now we still see the old data. The cacheline changes of the initial
processor are ignored?
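For reference, steps 1, 2, 4 and 5 can be sketched in userspace C. This is only an analogy under stated assumptions: `percpu_area`, `published_base`, `NR_CPUS` and `run_handover` are invented names, threads stand in for CPUs, and the handover uses an explicit C11 release/acquire pair instead of the mutex from step 3. With that pairing in place the remote side is guaranteed to see the initialized value.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <string.h>

#define NR_CPUS 4   /* assumed CPU count for this sketch */

struct counter { long value; };

static struct counter percpu_area[NR_CPUS];       /* stand-in for the percpu area */
static _Atomic(struct counter *) published_base;  /* stand-in for the percpu offset */

/* Steps 1, 2 and 4, run by the initializing "CPU". */
static void *initializer(void *arg)
{
    (void)arg;
    memset(percpu_area, 0, sizeof(percpu_area));  /* step 1: zero the area */
    for (int cpu = 0; cpu < NR_CPUS; cpu++)       /* step 2: init every instance */
        percpu_area[cpu].value = 42;
    /* step 4: publish; release orders the stores above before the pointer */
    atomic_store_explicit(&published_base, percpu_area, memory_order_release);
    return NULL;
}

/* Step 5, run by the remote "CPU": first use of its own instance. */
static void *remote_cpu(void *arg)
{
    long *seen = arg;
    struct counter *base;

    /* acquire pairs with the release store; spin until the base is visible */
    while (!(base = atomic_load_explicit(&published_base, memory_order_acquire)))
        ;
    *seen = base[1].value;   /* "CPU 1" reads its own slot */
    return NULL;
}

/* Runs one handover and returns the value the remote side observed. */
long run_handover(void)
{
    pthread_t init, remote;
    long seen = -1;

    atomic_store(&published_base, NULL);
    pthread_create(&remote, NULL, remote_cpu, &seen);
    pthread_create(&init, NULL, initializer, NULL);
    pthread_join(init, NULL);
    pthread_join(remote, NULL);
    return seen;
}
```

If both orderings are relaxed, the C11 model no longer promises that the remote thread sees 42, which is the userspace version of the question above.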
Ok if this is the case then we have another way of dealing with this in
alloc_percpu. Either zap the relevant remote cpu caches after the areas
were zeroed or do an IPI to make the remote processor run the percpu area