Re: 2.6.17-rc5-mm1

From: Martin J. Bligh
Date: Wed May 31 2006 - 17:52:35 EST


Andrew Morton wrote:
"Martin J. Bligh" <mbligh@xxxxxxxxxx> wrote:

Andrew Morton wrote:

Martin Bligh <mbligh@xxxxxxxxxx> wrote:


The x86_64 panic in LTP has changed a bit. Looks more useful now.
Possibly just unrelated new stuff. Possibly we got lucky.

What are you doing to make this happen?

runalltests on LTP



We have to get to the bottom of this - there's a shadow over about 500
patches and we don't know which.

We did do one chop, and concluded it wasn't the x86_64 patches.
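
Repeating that kind of chop until one patch is left is a plain bisection over the series. A minimal sketch, assuming the -mm patches apply in a fixed order, the panic reproduces reliably, and once the offending patch is applied every later tree also fails. `is_bad` is a hypothetical callback (apply the first n patches, build, boot, run the LTP suite, report whether it panicked); it is not part of any existing tool.

```python
def first_bad_patch(num_patches, is_bad):
    """Return the 1-based index of the first patch that introduces the failure.

    is_bad(n) must return True if the tree with the first n patches applied
    shows the failure. Assumes is_bad(0) is False (clean base is good) and
    is_bad(num_patches) is True (full series is bad).
    """
    lo, hi = 0, num_patches  # lo: known-good count, hi: known-bad count
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid  # failure is already present with the first mid patches
        else:
            lo = mid  # failure is introduced somewhere after patch mid
    return hi
```

With roughly 500 patches this needs at most 9 build-and-test cycles, provided each result can be trusted. An intermittent failure would need each step repeated before a tree is called good.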

iirc I tried to reproduce this a couple of weeks back and failed.

It looks like a different panic to me. It was a double-fault before.

Are you able to narrow it down to a particular LTP test? It was mtest01 or
something like that? Perhaps we can identify a particular command line
which triggers the fault in a standalone fashion?

I can't do much from here - it's running on an IBM machine. Have to beg
Andy, or one of the other IBMers, for help.

And why can't I make it happen? Perhaps it's a memory initialisation
problem, and it only happens to hit in that stage of LTP because that's
when you started doing page reclaim, or something?

It consistently happens on -mm, and not mainline, flicking back and
forth over time. So if you mean h/w mem init, I don't think so. If you
mean some patch in -mm, then yes.

Perhaps just try putting a heap of memory pressure on the machine,
see what that does?

Yes, the other stuff might not be swapping.
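
One rough way to generate that pressure from userspace is to allocate a lot of memory and write to every page so the kernel has to back it. A minimal sketch, not the LTP test itself: `apply_memory_pressure` and its parameters are made up for illustration, and whether this pushes a given machine into reclaim or swap depends on how much RAM it has.

```python
import mmap

PAGE = mmap.PAGESIZE  # page size reported by the platform


def apply_memory_pressure(total_bytes, chunk_bytes=64 * 1024 * 1024):
    """Allocate total_bytes in chunks and write one byte in every page.

    Writing to each page forces the kernel to actually back it with memory.
    The returned list holds the buffers; keep it referenced for as long as
    the pressure should last.
    """
    chunks = []
    allocated = 0
    while allocated < total_bytes:
        size = min(chunk_bytes, total_bytes - allocated)
        buf = bytearray(size)
        for offset in range(0, size, PAGE):
            buf[offset] = 1  # dirty the page so it cannot stay unbacked
        chunks.append(buf)
        allocated += size
    return chunks
```

Calling it with a total somewhat larger than physical RAM, on a box with swap configured, should start page reclaim; running the suspect workload alongside would show whether reclaim is what triggers the panic.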

Being unable to reproduce it and not having a theory to go on leaves us
kinda stuck. Help, please?

Yeah, we have a sniff-testing mechanism that works well. However,
drill-down still requires significant amounts of human intervention.
The next gen of stuff should help do more intelligent stuff, but we're
kind of stuck with human-ness for now.

M.
