Re: page fault scalability patch V11 [0/7]: overview

From: Robin Holt
Date: Fri Nov 19 2004 - 22:41:42 EST


On Fri, Nov 19, 2004 at 07:04:25PM -0800, William Lee Irwin III wrote:
> Why the Hell would you bother giving each cpu a separate cacheline?
> The odds of bouncing significantly merely amongst the counters are not
> particularly high.

Agreed. We are currently using atomic ops on a global rss in our 2.4
kernel on 512-CPU systems and are not seeing much cacheline contention.
I don't remember exactly how little it ended up being, but it was very
little.
We had gone to dropping the page_table_lock and only reacquiring it if
the pte was non-null when we went to insert our new one. I think that
was how we had it working. I would have to wake up and actually look
at that code, as it was many months ago that Ray Bryant did that work.
We did make rss atomic. Most of the contention is sorted out by the
mmap_sem. Processes coming off of mmap_sem were found to have spaced
themselves out enough that their atomic_add calls landed at roughly
evenly spaced times and therefore had very little contention for the
cacheline. At least it was not enough that we could measure it as
significant.
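
For illustration only, here is a minimal user-space sketch of the scheme
described above: try to insert the new pte without holding the lock, fall
back to the lock only when the slot turns out to be non-null, and bump a
single global atomic rss. This is not the 2.4 patch itself. The names
mm_sketch, mm_sketch_init, fault_in_page and NPTES are hypothetical, the
"page table" is a flat array, and C11 atomics plus a toy spinlock stand in
for the kernel's cmpxchg, atomic_add and page_table_lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NPTES 8  /* hypothetical: a tiny flat "page table" */

/* Hypothetical stand-in for the per-process mm state; not the kernel layout. */
struct mm_sketch {
    atomic_flag page_table_lock;        /* toy spinlock */
    atomic_long rss;                    /* one global counter, no per-cpu split */
    _Atomic unsigned long pte[NPTES];   /* 0 means "no mapping yet" */
};

static void mm_sketch_init(struct mm_sketch *mm)
{
    atomic_flag_clear(&mm->page_table_lock);
    atomic_init(&mm->rss, 0);
    for (size_t i = 0; i < NPTES; i++)
        atomic_init(&mm->pte[i], 0);
}

/* Returns 1 if this call installed new_pte, 0 if a pte was already present. */
static int fault_in_page(struct mm_sketch *mm, size_t idx, unsigned long new_pte)
{
    unsigned long expected = 0;

    assert(idx < NPTES);
    assert(new_pte != 0);

    /* Fast path: lock not held; install only if the slot is still empty. */
    if (atomic_compare_exchange_strong(&mm->pte[idx], &expected, new_pte)) {
        atomic_fetch_add(&mm->rss, 1);  /* the only write to the shared line */
        return 1;
    }

    /* Slow path: the pte was non-null, so take the lock before going on. */
    while (atomic_flag_test_and_set(&mm->page_table_lock))
        ;  /* spin */
    /* A real fault handler would inspect and resolve the existing pte here. */
    atomic_flag_clear(&mm->page_table_lock);
    return 0;
}
```

A caller would run mm_sketch_init() once and then call fault_in_page() from
each faulting thread. Only a fault that loses the race, or that hits an
already populated slot, ever touches the lock; everything else costs one
compare-and-swap on the pte and one atomic add on rss.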