Re: [PATCH 0/2] jump label: 2.6.38 updates
From: Paul E. McKenney
Date: Mon Feb 14 2011 - 19:43:18 EST
On Mon, Feb 14, 2011 at 06:29:47PM -0500, Mathieu Desnoyers wrote:
> [ added Segher Boessenkool and Paul Mackerras to CC list ]
>
> * Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> > On Mon, Feb 14, 2011 at 06:03:01PM -0500, Mathieu Desnoyers wrote:
> > > * Matt Fleming (matt@xxxxxxxxxxxxxxxxx) wrote:
> > > > On Mon, 14 Feb 2011 13:46:00 -0800 (PST)
> > > > David Miller <davem@xxxxxxxxxxxxx> wrote:
> > > >
> > > > > From: Steven Rostedt <rostedt@xxxxxxxxxxx>
> > > > > Date: Mon, 14 Feb 2011 16:39:36 -0500
> > > > >
> > > > > > Thus it is not about global, as global is updated by normal means
> > > > > > and will update the caches. atomic_t is updated via the ll/sc that
> > > > > > ignores the cache and causes all this to break down. IOW... broken
> > > > > > hardware ;)
> > > > >
> > > > > I don't see how cache coherency can possibly work if the hardware
> > > > > behaves this way.
> > > >
> > > > Cache coherency is still maintained provided writes/reads both go
> > > > through the cache ;-)
> > > >
> > > > The problem is that for read-modify-write operations the arbitration
> > > > logic that decides who "wins" and is allowed to actually perform the
> > > > write, assuming two or more CPUs are competing for a single memory
> > > > address, is not implemented in the cache controller, I think. I'm not a
> > > > hardware engineer and I never understood how the arbitration logic
> > > > worked but I'm guessing that's the reason that the ll/sc instructions
> > > > bypass the cache.
> > > >
> > > > Which is why the atomic_t functions worked out really well for that
> > > > arch, such that any accesses to an atomic_t * had to go through the
> > > > wrapper functions.
> >
> > ???
> >
> > What CPU family are we talking about here? For cache coherent CPUs,
> > cache coherence really is supposed to work, even for mixed atomic and
> > non-atomic instructions to the same variable.
> >
>
> I'm really curious to know which CPU families too. I've used git blame
> to see where these lwz/stw instructions were added to powerpc, and it
> points to:

But lwz and stw instructions are normal non-atomic PowerPC loads and
stores. No LL/SC -- those would instead be lwarx and stwcx.

							Thanx, Paul

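
A minimal sketch of the distinction drawn above, written in portable C11
rather than PowerPC assembly so that it builds anywhere. The names
sketch_atomic_t, sketch_atomic_read(), sketch_atomic_set() and
sketch_atomic_add_return() are hypothetical stand-ins, not the kernel's
definitions. The instruction selection mentioned in the comments is what
one would typically expect from a compiler targeting PowerPC, not
something taken from this thread.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for atomic_t; NOT the kernel's definition. */
typedef struct {
	atomic_int counter;
} sketch_atomic_t;

/*
 * Plain read: a single load, no reservation involved.
 * Assumption: on PowerPC a relaxed 32-bit load is typically an
 * ordinary lwz.
 */
static int sketch_atomic_read(sketch_atomic_t *v)
{
	return atomic_load_explicit(&v->counter, memory_order_relaxed);
}

/*
 * Plain write: a single store.
 * Assumption: on PowerPC this is typically an ordinary stw.
 */
static void sketch_atomic_set(sketch_atomic_t *v, int i)
{
	atomic_store_explicit(&v->counter, i, memory_order_relaxed);
}

/*
 * Read-modify-write: needs a retry loop. On LL/SC architectures such
 * as PowerPC, a compare-exchange loop like this one is typically built
 * from lwarx (load and reserve) and stwcx. (store conditional).
 */
static int sketch_atomic_add_return(sketch_atomic_t *v, int a)
{
	int old = atomic_load_explicit(&v->counter, memory_order_relaxed);

	/* On failure, 'old' is refreshed with the current value; retry. */
	while (!atomic_compare_exchange_weak_explicit(&v->counter, &old,
						      old + a,
						      memory_order_relaxed,
						      memory_order_relaxed))
		;

	return old + a;
}
```

Only the last function needs LL/SC. The read and set helpers are single
accesses, which is consistent with the commit quoted below using lwz and
stw for them.
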
> commit 9f0cbea0d8cc47801b853d3c61d0e17475b0cc89
> Author: Segher Boessenkool <segher@xxxxxxxxxxxxxxxxxxx>
> Date: Sat Aug 11 10:15:30 2007 +1000
>
> [POWERPC] Implement atomic{,64}_{read,write}() without volatile
>
> Instead, use asm() like all other atomic operations already do.
>
> Also use inline functions instead of macros; this actually
> improves code generation (some code becomes a little smaller,
> probably because of improved alias information -- just a few
> hundred bytes total on a default kernel build, nothing shocking).
>
> Signed-off-by: Segher Boessenkool <segher@xxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Paul Mackerras <paulus@xxxxxxxxx>
>
> So let's ping the relevant people to see if there was any reason for
> making these atomic read/set operations different from other
> architectures in the first place.
>
> Thanks,
>
> Mathieu
>
> --
> Mathieu Desnoyers
> Operating System Efficiency R&D Consultant
> EfficiOS Inc.
> http://www.efficios.com