Re: [PATCH v2] kexec: Fix kdump failure with notsc

From: Wei, Jiangang
Date: Wed Jul 13 2016 - 03:47:09 EST


On Mon, 2016-07-11 at 18:28 +0800, Wei Jiangang wrote:
> Hi , Ingo
>
> On Fri, 2016-07-08 at 09:38 +0200, Ingo Molnar wrote:
> > * Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
> >
> > > Sigh. Can we please just do the work to rip out the apic shutdown code from the
> > > kexec on panic code path?
> > >
> > > I forgetting details but the only reason we have do any apic shutdown is bugs in
> > > older kernels that could not initialize a system properly if we did not shut
> > > down the apics.
> > >
> > > I certainly don't see an issue with goofy cases like notsc not working on a
> > > crash capture kernel if we are not initializing the hardware properly.
> > >
> > > The strategy really needs to be to only do the absolutely essential hardware
> > > shutdown in the crashing kernel, every adintional line of code we execute in the
> > > crashing kernel increases our chances of hitting a bug.
> >
> > Fully agreed.
> >
> > > Under that policy things like requring we don't pass boot options that inhibit
> > > the dump catpure kernel from initializing the hardware from a random state are
> > > reasonable requirements. AKA I don't see any justification in this as to why we
> > > would even want to support notsc on the dump capture kernel. Especially when
> > > things clearly work when that option is not specified.
> >
> > So at least on the surface it appears 'surprising' that the 'notsc' option (which,
> > supposedly, disables TSC handling) interferes with being able to fully boot. Even
> > if 'notsc' is specified we are still using the local APIC, right?
>
> In most case, It's no problem that using local APIC while notsc is
> specified.
> But not for kdump.
>
> We can get evidence, Especially from "Spurious LAPIC timer interrupt on
> cpu 0".
>
> ###serial log,
>
> [ 0.000000] NR_IRQS:524544 nr_irqs:256 16
> [ 0.000000] Spurious LAPIC timer interrupt on cpu 0
> [ 0.000000] Console: colour dummy device 80x25
> [ 0.000000] console [tty0] enabled
> [ 0.000000] console [ttyS0] enabled
> [ 0.000000] tsc: Fast TSC calibration using PIT
> [ 0.000000] tsc: Detected 2099.947 MHz processor
> [ 0.000000] Calibrating delay loop...
>
>
> Due to the local apic and local apic timer hasn't been setup and enabled
> fully, The event_handler of clock event is NULL.
>
> ###codes,
>
> static void local_apic_timer_interrupt(void)
> {
> int cpu = smp_processor_id();
> struct clock_event_device *evt = &per_cpu(lapic_events, cpu);
>
> /*
> * Normally we should not be here till LAPIC has been initialized
> but
> * in some cases like kdump, its possible that there is a pending
> LAPIC
> * timer interrupt from previous kernel's context and is delivered
> in
> * new kernel the moment interrupts are enabled.
> *
> * Interrupts are enabled early and LAPIC is setup much later, hence
> * its possible that when we get here evt->event_handler is NULL.
> * Check for event_handler being NULL and discard the interrupt as
> * spurious.
> */
> if (!evt->event_handler) {
> pr_warning("Spurious LAPIC timer interrupt on cpu %d\n", cpu);
> /* Switch it off */
> lapic_timer_shutdown(evt);
> return;
> }
>
> .............
> }
>
>
> IMHO,
> If we specify notsc, the dump-capture kernel waits for jiffies being
> updated early and LAPIC and timer are setup much later, which causes no
> timer interrupts is passed to BSP. as following,
>
> start_kernel -->
> 1)-> calibrate_delay() -> calibrate_delay_converge() # hang and wait
> for jiffies changed
>
> 2)-> rest_init() -> kernel_init() -> .... -> apic_bsp_setup() ->
> setup_local_APIC()
>
> -> setup_percpu_clockev().
>
> the setup_percpu_clockev points setup_boot_APIC_clock() which used to
> setup the boot APIC and timer.
>
>
> > So it might be a good idea to find the root cause of this bootup fragility even if
> > 'notsc' is specified. And I fully agree that it should be fixed in the bootup path
> > of the dump kernel, not the crash kernel reboot path.
>

Can anyone give some advice or commet on the following ideaï
Thanks in advance.

> Because the lapic and timer are not ready when dump-capture waits them
> to update the jiffies value. so I suggest to put APIC in legacy mode in
> local_apic_timer_interrupt() temporarily, which in the bootup path of
> dump kernel.
>
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index dcb52850a28f..af3be93997ed 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -879,6 +879,7 @@ static void local_apic_timer_interrupt(void)
> pr_warning("Spurious LAPIC timer interrupt on cpu %d\n",
> cpu);
> /* Switch it off */
> lapic_timer_shutdown(evt);
> + disable_IO_APIC();
> return;
> }
>
> And the new solution can fix the problem.
> Whatâs your opinion about it?
>
> Thanks,
> wei
>
> >
> > Thanks,
> >
> > Ingo
> >
> >
>