Re: 2.6.25-mm1: not looking good

From: Vegard Nossum
Date: Fri Apr 18 2008 - 09:23:33 EST


On Fri, Apr 18, 2008 at 3:02 PM, Jason Wessel
<jason.wessel@xxxxxxxxxxxxx> wrote:
> Vegard Nossum wrote:
> > On Fri, Apr 18, 2008 at 2:34 PM, Ingo Molnar <mingo@xxxxxxx> wrote:
> >
> >> * Vegard Nossum <vegard.nossum@xxxxxxxxx> wrote:
> >>
> >> > With the patch below, it seems 100% reproducible to me (7 out of 7
> >> > bootups hung).
> >> >
> >> > The number of loops it could do before hanging were, in order: 697,
> >> > 898, 237, 55, 45, 92, 59
> >>
> >> cool! Jason: i think that particular self-test should be repeated 1000
> >> times before reporting success ;-)
> >>
> >
> > BTW, I just tested a 32-bit config and it hung after 55 iterations as well.
> >
> > Vegard
> >
> >
> >
> I assume this was SMP?

Yes. Having realized that, I tried running the same kernel under
qemu with -smp 16, and it seems to be stuck here:

[ 16.562659] kgdb: Registered I/O driver kgdbts.
[ 16.565875] kgdbts:RUN plant and detach test

The corresponding code is in kgdb_handle_exception():

	/*
	 * Wait for the other CPUs to be notified and be waiting for us:
	 */
	for_each_online_cpu(i) {
		while (!atomic_read(&cpu_in_kgdb[i]))
			cpu_relax();
	}
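
To narrow down which CPU never checks in, a bounded version of that
wait could help. The following is only a user-space sketch using C11
atomics, not kernel code and not a proposed patch: cpu_in_debugger,
NR_CPUS_SKETCH, wait_for_cpus() and the max_spins budget are all
hypothetical names made up for illustration. It shows the same
rendezvous pattern, but gives up after a fixed number of polls and
reports the first CPU whose flag is still clear instead of spinning
forever.

```c
#include <stdatomic.h>

/* Hypothetical CPU count for the sketch; mirrors the qemu -smp 16 run. */
#define NR_CPUS_SKETCH 16

/* Hypothetical stand-in for the per-CPU "I am in the debugger" flags. */
static atomic_int cpu_in_debugger[NR_CPUS_SKETCH];

/*
 * Poll each CPU's flag in turn, as the original loop does, but with a
 * per-CPU spin budget.
 *
 * Returns -1 if every CPU checked in, otherwise the index of the first
 * CPU whose flag was still clear after max_spins polls.
 */
static int wait_for_cpus(int ncpus, long max_spins)
{
	for (int i = 0; i < ncpus; i++) {
		long spins = 0;

		while (!atomic_load(&cpu_in_debugger[i])) {
			if (++spins >= max_spins)
				return i;	/* this CPU never arrived */
		}
	}
	return -1;
}
```

In the kernel the equivalent would presumably need a time-based limit
rather than a poll count, and what to do on timeout is a separate
question; the sketch is only meant to show how the unbounded wait hides
which CPU is missing.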


>
> While I had not tried it yet, my guess would have been this did not
> happen on a UP kernel. If it does occur on a UP kernel it means the
> problem is squarely between the task scheduling after the exception is
> handled and the kgdb state logic for re-entering the debug state after a
> single step exception occurs.
>
> It seems reasonable to go for 1000 iterations of this particular test to
> declare success as pointed out by Ingo. Previous versions of kgdb
> handled some of the irq + single step + cpu sync slightly differently
> and it is entirely possible there is a regression there.
>
> Jason.


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036