Re: page fault scalability patch V11 [0/7]: overview

From: Nick Piggin
Date: Fri Nov 19 2004 - 23:40:21 EST


William Lee Irwin III wrote:
William Lee Irwin III wrote:

/proc/ triggering NMI oopses was a persistent problem even before that
code was merged. I've not bothered testing it as it at best aggravates it.


On Sat, Nov 20, 2004 at 03:03:17PM +1100, Nick Piggin wrote:

It isn't a problem. If it ever became a problem then we can just
touch the nmi oopser in the loop.


Very, very wrong. The tasklist scans hold the read side of the lock
and aren't even what's running with interrupts off. The contenders
on the write side are what the NMI oopser oopses.


*blinks*

So explain how this is "very very wrong", then?

And supposing the arch reenables interrupts in the write side's
spinloop, you just get a box that silently goes out of service for
extended periods of time, breaking cluster membership and more. The
NMI oopser is just the report of the problem, not the problem itself.
It's not a false report. The box is dead for > 5s at a time.


The point is, adding a for-each-thread loop or two in /proc isn't
going to cause a problem that isn't already there.

If you had zero for-each-thread loops then you might have a valid
complaint. Seeing as you have more than zero, with slim chances of
reducing that number, then there is no valid complaint.


William Lee Irwin III wrote:

And thread groups can share mm's. do_for_each_thread() won't suffice.


On Sat, Nov 20, 2004 at 03:03:17PM +1100, Nick Piggin wrote:

I think it will be just fine.


And that makes it wrong on both counts. The above fails any time
LD_ASSUME_KERNEL=2.4 is used, we well as when actual Linux features
are used directly.


See my followup.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/