Re: [patch, 2.6.10-rc2] sched: fix ->nr_uninterruptible handling bugs

From: Ingo Molnar
Date: Wed Nov 17 2004 - 04:30:46 EST



* Linus Torvalds <torvalds@xxxxxxxx> wrote:

> On Wed, 17 Nov 2004, Ingo Molnar wrote:
> >
> > maybe, but why? Atomic ops are still a tad slower than normal ops
>
> A "tad" slower?
>
> An atomic op is pretty much as expensive as a spinlock/unlock pair on
> x86. Not _quite_, but it's pretty close.

it really depends on the layout of the data structure. The main cost we
typically see with atomic ops and spinlocks shows up when the target of
the atomic op is a global/shared variable, in which case the
cacheline-bounce cost is prohibitive and dwarfs any micro-cost of the
instruction itself.

But in this particular rq->nr_uninterruptible case we have per-CPU
variables, so the cost of the atomic op is, in theory, quite close to
the cost of a normal op.
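
to illustrate the layout point, here's a minimal sketch (not the actual
scheduler code - the struct and function names below are made up for the
example) contrasting a single shared counter that every CPU bounces on
with per-CPU counters that are only summed when someone needs the global
value:

/* illustration only - not the real runqueue code */

#define NR_CPUS 8

/*
 * Layout A: one shared counter. Every locked increment of it bounces
 * the cacheline between CPUs, and that bounce dwarfs the cost of the
 * LOCK prefix itself.
 */
static volatile long shared_nr_uninterruptible;

/*
 * Layout B: one counter per (fake) runqueue, each padded out to its
 * own cacheline, so updates stay CPU-local and nothing bounces.
 */
struct fake_rq {
	long nr_uninterruptible;
} __attribute__((aligned(64)));

static struct fake_rq fake_rqs[NR_CPUS];

/* readers that want the global value walk the per-CPU counters */
static long sum_nr_uninterruptible(void)
{
	long sum = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += fake_rqs[cpu].nr_uninterruptible;
	return sum;
}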

In practice this means it's 10 cycles more expensive on P3-style CPUs.
(I think on P4-style CPUs it should be much closer to 0, but i haven't
been able to reliably time it there - cycle measurements show a 76-cycle
cost, which is way out of line.)
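
for reference, the kind of quick user-space measurement meant above can
be done with something like the sketch below (a rough rdtsc-based timer,
not the harness i actually used - it does not serialize rdtsc, so the
numbers are only ballpark):

/* rough sketch: time plain incl vs. lock incl on x86 */
#include <stdio.h>

static inline unsigned long long rdtsc(void)
{
	unsigned int lo, hi;

	__asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
	return ((unsigned long long)hi << 32) | lo;
}

int main(void)
{
	volatile int counter = 0;
	unsigned long long t0, t1, t2;
	int i;

	t0 = rdtsc();
	for (i = 0; i < 1000000; i++)
		__asm__ __volatile__("incl %0" : "+m"(counter));
	t1 = rdtsc();
	for (i = 0; i < 1000000; i++)
		__asm__ __volatile__("lock; incl %0" : "+m"(counter));
	t2 = rdtsc();

	printf("plain incl: %llu cycles/op\n", (t1 - t0) / 1000000);
	printf("lock incl:  %llu cycles/op\n", (t2 - t1) / 1000000);
	return 0;
}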

On a UP Athlon64 the cost of a LOCK-ed op is exactly the same as without
the LOCK prefix - but there the CPU can take shortcuts, because in theory
it can skip any cache-coherency considerations altogether. (Although an
atomic op should still matter for DMA-able data - perhaps UP-mode CPUs
simply ignore that case.) Also, i think it ought to be near zero-cost on
a Crusoe-style CPU?

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/