Re: Simple script that locks up my box with recent kernels

From: Jesper Juhl
Date: Mon Oct 16 2006 - 19:13:30 EST


On 17/10/06, Linus Torvalds <torvalds@xxxxxxxx> wrote:


On Tue, 17 Oct 2006, Jesper Juhl wrote:
>
> Ok, finally got to the end of the bisection (see below; quoting all of
> my previous email since my concerns from that one are still valid).

Ok. It does smell like you marked somethign good that wasn't. That commit
1db27c11 was the last one you claimed was bad, of course, so it's the one
git will claim caused it, when you've marked its parent good.

> Where do I go from here? The problem is still there... I'll test
> 2.6.19-rc2 tomorrow, but apart from that I don't know how to proceed
> apart from trying to capture a sysrq+t dump when the box locks up...
> any ideas?

Yeah, trying to do sysrq when it locks is probably worth it. As is
enabling debugging things (netconsole, page-alloc, slab alloc, lockdep
etc).

I've got all those debug options (and more) enabled already in all the
bisection builds. I run with those options enabled most of the time
and I didn't change my config for any of the kernels I tested (except
for running 'make oldconfig').
netconsole is not much use to me as I don't have a second box at the
moment to capture output on :-( So the best I can do there is to let
the box run in a plain console with the test script and then press
sysrq+t when it locks and take a photo of the output (or whats left of
it on the screen) if any.


But if nothing seems to really give any clues, you might just try
to restart bisection with

git bisect reset
git bisect start
git bisect good v2.6.17
git bisect bad 1db27c11

and just run the resulting kernel version for a day or two. If an hour
wasn't really good enough, it's not as repeatable as we'd have wished, but
even if it takes a few days to narrow it down by just two bisections or
so, it will cut things down from ten thousand commits to "just" 2500..

Ok, sure. I'll do a days run of 2.6.19-rc2 first, just to see if it's
been fixed in the mean time. If it's still there I'll try to get a
sysrq+t and post that, then I'll restart bisection and give each
kernel a full 24hrs of testing before concluding it is good.

I'll report back as soon as I have some results.

--
Jesper Juhl <jesper.juhl@xxxxxxxxx>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/