Re: [tip:perfcounters/urgent] x86: atomic64: The atomic64_t datatype should be 8 bytes aligned on 32-bit too

From: Linus Torvalds
Date: Fri Jul 03 2009 - 13:00:03 EST

On Fri, 3 Jul 2009, tip-bot for Eric Dumazet wrote:
>
> x86: atomic64: The atomic64_t data type should be 8 bytes aligned on 32-bit too
>
> Locked instructions on two cache lines at once are painful. If
> atomic64_t uses two cache lines, my test program is 10x slower.
>
> The chance for that is significant: 4/32 or 12.5%.

Btw, the comments here are not strictly correct.

It's not necessarily even about "two cachelines". It's true that crossing
cachelines is extra painful, but from a CPU core angle, there's another
access width that matters almost as much, namely the width of the bus
between the core and the L1 cache. If it's not aligned to that, the core
needs to do each 8-byte read/write as two accesses, even if it's to the
same cacheline, and that complicates things.

The cacheline itself is generally larger than the cache access width. I
could easily see a 64-byte cacheline, but only a 256-bit (32-byte) bus
between the cache and the core.
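
A minimal sketch of the two boundaries, assuming the 64-byte cacheline and
32-byte access width used as an example above. The helper names crosses()
and count_crossings() are hypothetical, chosen for illustration only:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if an access of `size` bytes starting at `addr` spans a
 * `width`-byte boundary, i.e. its first and last byte fall into
 * different `width`-sized blocks. */
static bool crosses(uintptr_t addr, size_t size, size_t width)
{
    return (addr / width) != ((addr + size - 1) / width);
}

/* Count the offsets in [0, span), stepping by `align`, at which an
 * 8-byte access crosses a `width`-byte boundary. */
static unsigned count_crossings(size_t span, size_t align, size_t width)
{
    unsigned n = 0;

    for (size_t off = 0; off < span; off += align)
        if (crosses(off, 8, width))
            n++;
    return n;
}
```

With only 4-byte alignment, count_crossings(64, 4, 64) finds one offset (60)
that crosses the cacheline, while count_crossings(64, 4, 32) finds two
(28 and 60) that need two accesses on a 32-byte bus. For a 32-byte line,
count_crossings(32, 4, 32) gives one offset out of eight, which matches the
12.5% quoted above.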

Making the atomics be naturally aligned means that you never cross either
one, of course.
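
One way to get natural alignment in C11 is sketched below. The type name
my_atomic64_t and the helper misalignment() are hypothetical; this is a
stand-in for illustration, not the kernel's atomic64_t definition:

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit atomic container. On 32-bit x86
 * a plain uint64_t struct member may only be 4-byte aligned, so the
 * 8-byte alignment is requested explicitly. */
typedef struct {
    _Alignas(8) uint64_t counter;
} my_atomic64_t;

/* Checked at compile time: the type is 8 bytes wide and 8-byte aligned. */
_Static_assert(sizeof(my_atomic64_t) == 8, "expected an 8-byte type");
_Static_assert(_Alignof(my_atomic64_t) == 8, "expected 8-byte alignment");

/* Byte offset of `p` within its 8-byte slot; 0 means naturally aligned. */
static unsigned misalignment(const my_atomic64_t *p)
{
    return (unsigned)((uintptr_t)p % 8);
}
```

Any power-of-two boundary of 8 bytes or more is a multiple of 8, so an
8-byte object that starts on an 8-byte boundary always sits inside a single
block, whether that block is a 32-byte access window or a 64-byte cacheline.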

Linus