Re: page fault scalability patch V11 [0/7]: overview

From: William Lee Irwin III
Date: Sat Nov 20 2004 - 14:09:35 EST


On Sat, Nov 20, 2004 at 09:14:11AM -0800, Linus Torvalds wrote:
> I will pretty much guarantee that if you put the per-thread patches next
> to some abomination with per-cpu allocation for each mm, the choice will
> be clear. Especially if the per-cpu/per-mm thing tries to avoid false
> cacheline sharing, which sounds really "interesting" in itself.
> And without the cacheline sharing avoidance, what's the point of this
> again? It sure wasn't to make the code simpler. It was about performance
> and scalability.

"The perfect is the enemy of the good."

The "perfect" cacheline separation achieved that way is at the cost of
destabilizing the kernel. The dense per-cpu business is only really a
concession to the notion that the counter needs to be split up at all,
which has never been demonstrated with performance measurements. In fact,
Robin Holt has performance measurements demonstrating the opposite.

The "good" alternatives are negligibly different wrt. performance, and
don't carry the high cost of rwlock starvation that breaks boxen.


-- wli