Re: 2.5.36-mm1 dbench 512 profiles

From: Martin J. Bligh
Date: Fri Sep 20 2002 - 11:07:03 EST

>> For a 32-way system fastwalk will perform badly from dcache_lock point of
>> view, basically due to increased lock hold time. dcache_rcu-12 should reduce
>> dcache_lock contention and hold time.
> Isn't increased hold time _good_ on NUMA-Q? I thought that the
> really costly operation was bouncing the lock around the interconnect,
> not holding it.

Depends what you get in return. The object of fastwalk was to stop the
cacheline bouncing on all the individual dentry counters, at the cost
of increased dcache_lock hold times. It's a tradeoff ... and in this
instance it wins. In general, long lock hold times are bad.
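
To make the tradeoff concrete, here is a rough userspace sketch, not
kernel code. It assumes a toy model in which a classic walk takes the
global lock once per path component and pins/unpins each dentry, while a
fastwalk-style walk takes the lock once and only pins the final dentry.
The names toy_dentry, toy_dcache_lock, walk_refcounted and walk_fastwalk
are made up for illustration, and a pthread mutex stands in for the
kernel spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical toy dentry: only the shared reference counter matters here. */
struct toy_dentry {
    atomic_int d_count;
};

/* Stand-in for dcache_lock (assumption: a mutex instead of a spinlock). */
static pthread_mutex_t toy_dcache_lock = PTHREAD_MUTEX_INITIALIZER;

struct walk_stats {
    int lock_acquisitions; /* times the global lock was taken */
    int refcount_writes;   /* writes to shared per-dentry counters */
};

/* Classic walk: short lock hold per component, but every component's
 * counter is written (pin the child, then unpin the parent). */
static struct walk_stats walk_refcounted(struct toy_dentry *path, size_t n)
{
    struct walk_stats s = {0, 0};

    assert(path != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        pthread_mutex_lock(&toy_dcache_lock);
        s.lock_acquisitions++;
        atomic_fetch_add(&path[i].d_count, 1);       /* pin this component */
        s.refcount_writes++;
        pthread_mutex_unlock(&toy_dcache_lock);
        if (i > 0) {
            atomic_fetch_sub(&path[i - 1].d_count, 1); /* unpin the parent */
            s.refcount_writes++;
        }
    }
    return s;
}

/* Fastwalk-style walk: one long lock hold, intermediate counters untouched,
 * only the final component is pinned. */
static struct walk_stats walk_fastwalk(struct toy_dentry *path, size_t n)
{
    struct walk_stats s = {0, 0};

    assert(path != NULL || n == 0);
    if (n == 0)
        return s;
    pthread_mutex_lock(&toy_dcache_lock);
    s.lock_acquisitions++;
    /* components 0..n-2 would be looked up here without any counter writes */
    atomic_fetch_add(&path[n - 1].d_count, 1);
    s.refcount_writes++;
    pthread_mutex_unlock(&toy_dcache_lock);
    return s;
}
```

In this model a four-component path costs four lock acquisitions and
seven counter writes the classic way, against one acquisition and one
write for the fastwalk-style walk. The price is that the single hold
covers the whole walk.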

> Has fastwalk ever been tested on NUMA-Q?

Yes, in 2.4. Gave good results, I forget exactly what ... something
like 5-10% off kernel compile times.

> Remember when John Stultz tried MCS (fair) locks on NUMA-Q? They
> sucked because low hold times, which result from fairness, aren't
> efficient. It is actually faster to somewhat starve remote CPUs.

Nothing to do with low hold times - it's to do with bouncing the
lock between nodes.
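
A toy way to see the bouncing, purely as an illustration: count how
often lock ownership crosses a node boundary for the same set of
waiters under two handoff policies. This is a sketch under assumptions,
not a model of MCS locks or of the NUMA-Q hardware. Waiter 0 is assumed
to hold the lock first, and handoffs_fifo, handoffs_local_first and
MAX_WAITERS are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WAITERS 64 /* arbitrary bound for this sketch */

/* Fair policy: waiters get the lock strictly in arrival order.
 * node[i] is the node id of the i-th waiter to arrive. */
static int handoffs_fifo(const int *node, size_t n)
{
    int crossings = 0;

    for (size_t i = 1; i < n; i++)
        if (node[i] != node[i - 1])
            crossings++; /* lock moved to another node */
    return crossings;
}

/* Unfair policy: the oldest waiter on the holder's own node goes next;
 * only when that node has no waiters left does the lock move away. */
static int handoffs_local_first(const int *node, size_t n)
{
    bool served[MAX_WAITERS] = {false};
    int crossings = 0;
    size_t holder = 0;

    assert(n <= MAX_WAITERS);
    if (n == 0)
        return 0;
    served[0] = true;
    for (size_t done = 1; done < n; done++) {
        size_t next = n; /* n means "not found yet" */

        for (size_t i = 0; i < n; i++) {
            if (!served[i] && node[i] == node[holder]) {
                next = i;
                break;
            }
        }
        if (next == n) {
            /* nobody local is waiting: fall back to the oldest waiter */
            for (size_t i = 0; i < n; i++) {
                if (!served[i]) {
                    next = i;
                    break;
                }
            }
            crossings++;
        }
        served[next] = true;
        holder = next;
    }
    return crossings;
}
```

With eight waiters alternating between two nodes, the fair order moves
the lock across the interconnect seven times and the local-first order
moves it once, at the cost of making the remote node wait.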

> In any case, we all know often acquired global locks are a bad idea
> on a 32-way, and should be avoided like the plague. I just wish we
> had a dcache solution that didn't even need locks as much... :)

Well, avoiding data corruption is a preferable goal too. The point of
RCU is not to have to take a lock for the common read case. I'd expect
good results from it on the NUMA machines - never been benchmarked, as
far as I recall.
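
For anyone unfamiliar with the idea, a minimal userspace sketch of the
read side follows. It is an assumption-laden illustration, not the
dcache_rcu patch and not the kernel RCU API: C11 atomics stand in for
the kernel primitives, toy_entry, toy_read_ino and toy_update_ino are
made-up names, and the old copy is deliberately leaked because real RCU
would free it only after a grace period, which this sketch does not
implement.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical entry standing in for a cached object. */
struct toy_entry {
    int ino;
};

static _Atomic(struct toy_entry *) current_entry; /* NULL until first update */
static pthread_mutex_t update_lock = PTHREAD_MUTEX_INITIALIZER;

/* Read side: no lock taken, just one acquire load of the published
 * pointer. Returns -1 if nothing has been published yet. */
static int toy_read_ino(void)
{
    struct toy_entry *e =
        atomic_load_explicit(&current_entry, memory_order_acquire);

    return e ? e->ino : -1;
}

/* Update side: writers serialize on a lock, build a fresh copy and
 * publish it with a single pointer swap. Returns 0 on success. */
static int toy_update_ino(int ino)
{
    struct toy_entry *fresh;
    struct toy_entry *old;

    assert(ino >= 0); /* -1 is reserved for "nothing published" */
    fresh = malloc(sizeof *fresh);
    if (!fresh)
        return -1;
    fresh->ino = ino;

    pthread_mutex_lock(&update_lock);
    old = atomic_exchange_explicit(&current_entry, fresh,
                                   memory_order_acq_rel);
    pthread_mutex_unlock(&update_lock);

    /* Real RCU would free `old` after a grace period, once no reader can
     * still hold it. This sketch leaks it instead (assumption). */
    (void)old;
    return 0;
}
```

Readers never write to shared memory here, so there is no lock
cacheline to bounce in the common case; only updaters pay for
serialization.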



This archive was generated by hypermail 2b29 : Mon Sep 23 2002 - 22:00:31 EST