Re: Page aging broken in 2.6

From: Roger Luethi
Date: Sun Dec 28 2003 - 07:12:01 EST


On Sat, 27 Dec 2003 16:04:10 -0800, Andrew Morton wrote:
> > Having all processes blocked is indeed one problem of 2.6 under memory
> > pressure. I don't know what the cause is, though.
>
> I usually work this sort of thing out by "random sampling". When
> everything is in steady state, break into kgdb and start looking at task
> backtraces, see where they are all sleeping.

Well, there isn't really a steady state as such. On a loaded system
there are periods during compile benchmarks where the system spends
half the time and more in I/O wait, so some processes do get to run
and do some minimal amount of work.

> If it's in the pagefault handler, go up to do_page_fault() and work out the
> faulting address. Compare that with /proc/pid/maps to see if it's libc or
> whatever.
>
> Repeat the above N times until you have a decent feel for what's happening
> in there. It doesn't take long.

I instrumented the kernel a while ago to log page fault handling
(address, backing file if available) when the system became idle with
all processes blocked. I can resurrect that code which would allow for
larger samples. I'll post results if/when I get around to do it.

Roger
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/