Re: [cpuops cmpxchg V2 3/5] irq_work: Use per cpu atomics instead of regular atomics

From: H. Peter Anvin
Date: Wed Dec 15 2010 - 12:33:02 EST


On 12/15/2010 09:18 AM, Peter Zijlstra wrote:
> On Wed, 2010-12-15 at 11:04 -0600, Christoph Lameter wrote:
>
>> Prefixes are faster than explicit address calculations. A prefix allows
>> you to integrate the per cpu address calculation into an arithmetic
>> operation.
>
> Well, depends on how often you need that address I'd think. If you'd
> have a per-cpu struct and need to frob lots of variables in that struct
> it might be cheaper to simply compute the struct address once and then
> use relative addresses than to prefix everything with %fs.
>

Let's just make it clear -- current x86 CPUs generally do not have a
penalty for prefixes (it might be that under very unusual pipeline
conditions they do; I am not 100% sure). In fact, we changed the
patching of LOCK prefixes from a NOP to a %ds: prefix because it made
the code faster.

Some older CPUs do, but those are no longer relevant for performance
decisions.
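
A minimal user-space sketch of the two access patterns compared above,
assuming x86-64 Linux, where compilers typically address thread-local
data relative to %fs. The struct, its fields, and the function names are
hypothetical stand-ins for a per-cpu struct, not kernel code, and an
optimizing compiler may well emit the same instructions for both
variants.

```c
#include <stdint.h>

/* Hypothetical per-thread counters; a user-space stand-in for a per-cpu
 * struct. All names here are illustrative assumptions. */
struct stats {
	uint64_t events;
	uint64_t bytes;
	uint64_t errors;
};

/* Thread-local storage: on x86-64 Linux this is usually reached through
 * the %fs segment base (an assumption about the target, not a C rule). */
_Thread_local struct stats tls_stats;

/* Pattern 1: every access names the thread-local object directly, so
 * the compiler is free to emit one segment-prefixed instruction per
 * field update. */
void account_prefixed(uint64_t nbytes, int failed)
{
	tls_stats.events += 1;
	tls_stats.bytes += nbytes;
	if (failed)
		tls_stats.errors += 1;
}

/* Pattern 2: compute the struct address once, then update the fields
 * through an ordinary base pointer. */
void account_via_pointer(uint64_t nbytes, int failed)
{
	struct stats *s = &tls_stats;

	s->events += 1;
	s->bytes += nbytes;
	if (failed)
		s->errors += 1;
}
```

Both functions update the same counters; the only difference is
whether the address is formed per access or once up front. Comparing the
output of "gcc -O2 -S" for the two is one way to see what a given
compiler actually does with each.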

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
