Re: [Fastboot] Re: kdump on non-boot cpu

From: Eric W. Biederman
Date: Thu Feb 03 2005 - 21:52:40 EST


Itsuro Oda <oda@xxxxxxxxxxxxx> writes:

> Hi,
>
> On 03 Feb 2005 02:58:02 -0700
> ebiederm@xxxxxxxxxxxx (Eric W. Biederman) wrote:
>
> > Itsuro Oda <oda@xxxxxxxxxxxxx> writes:
> >
> > > Hi,
> > >
> > > This is not for kdump but an experience of our project(mkdump).
> > > The dump kernel(not SMP config) boot hangs if machine_kexec()
> > > excutes on non-boot CPU on x86_64 platform.
> >
> > ?? x86_64 is Opteron cpu, amd64, Intel cpu?
> > Are the kernels running in 32bit or 64bit mode. I'm guessing
> > SMP Opterons running in 32bit mode.
>
> SMP Opterons running in 64bit mode. (The normal kernel is SMP configed.
> The dump kernel is not SMP configed.)

The reason I was asking and assuming you had a 32bit kernel is that
you were quoting pieces of arch/i386/kernel/crash.c instead of
arch/x86_64/kernel/crash.c

While the functionality can easily be transfered I didn't think
any one had done that yet.

> > Anyway one thing I want to do is actually drop the apic shutdown
> > code altogether in this code path.
>
> It sounds nice. (if available)

All that has to happen is to drop the x86-crash_shutdown-apic-shutdown.patch
from Andrews tree. That patch just added the handful of lines that
disabled the apics. Once I get a test case that someone can boot a
kernel without disabling apics I will ask Andrew to drop the above
mentioned patch.

> > My best hunch is that your UP kernel is not getting interrupts.
>
> yes. It seems the timer interrupts is not comming up.
>
> > Any chance on getting a serial console boot log?
> >
> attached. both success case and unsuccess case.

Thanks.

> > I suspect it can be made to work if you compile your UP
> > kernel with IOAPIC support. I do know the table parsers
> > no longer complain about the configuration.
>
> with IOAPIC support. the config is attached.

Ok. Thanks. This is a legitimate bug. And it is probably the reason
I even care about the non-SMP interrupt case some days. The problem
is that the kernel just assumes interrupts are setup in non-APIC mode
when it starts booting, and quite possibly only the bootstrap cpu can
see those interrupts.

So I believe the fix needs to be to enable apics before we calibrate
the delay timer. I'm not certain off the top of my head what that
patch will look like but it should not be fundamentally hard.
With that code in place we also don't need to do any APIC shutdown
as the kernel knows enough to completely setup the apics.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/