RE: [PATCH 0/4] x86/Hyper-V: Panic code path fixes
From: Michael Kelley
Date: Thu Mar 19 2020 - 11:15:12 EST
From: Tianyu Lan <Tianyu.Lan@xxxxxxxxxxxxx> Sent: Thursday, March 19, 2020 7:08 AM
> >>
> >> This patchset fixes some issues in the Hyper-V panic code path.
> >> Patch 1 resolves the issue that a panicked system still responds to
> >> network packets.
> >> Patches 2-3 resolve crash enlightenment issues.
> >> Patch 4 sets crash_kexec_post_notifiers to true for Hyper-V
> >> VMs in order to report crash data or kmsg to the host before running
> >> the kdump kernel.
> >
> > I still see an issue that isn't addressed by these patches. The VMbus
> > driver registers a "die notifier" and a "panic notifier". But die() will
> > eventually call panic() if panic_on_oops is set (which I think it typically
> > is). If the CRASH_NOTIFY_MSG option is *not* enabled, then
> > hyperv_report_panic() could get called by the die notifier, and then
> > again by the panic notifier.
> >
> > Do we even need the "die notifier"? If it was removed, there would
> > not be any notification to Hyper-V via the die() path unless panic_on_oops
> > is set, which I think is actually the correct behavior. I'm not
> > completely clear on what is supposed to happen in general to the
> > Linux kernel if panic_on_oops is not set. Does it try to continue to run?
> > If so, then we should not be notifying Hyper-V if panic_on_oops is not
> > set, and removing the die notifier is the right thing to do.
> >
>
> hyperv_report_panic() has a re-entry check inside, so the kernel
> reports crash register data only once during die().
Ah, yes, you are right.
> According to the comment in
> hyperv_report_panic(), the register values reported in the die chain are
> more exact than the values in the panic chain. The register values in the
> die chain are passed by the die() caller. The register values reported in
> the panic chain are collected in hyperv_panic_event().
>
> If panic_on_oops is not set, the task should be killed and the kernel
> keeps running. In this case, we may not trigger crash enlightenment.
I'm not completely clear on your last statement. It seems like there
is still a problem in that die() will call hyperv_report_panic() even if
panic_on_oops is not set. We will have reported a panic to Hyper-V
even though the VM did not stop running.
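
To make the concern concrete, here is a minimal user-space sketch. It is
not the kernel code: every name ending in _sim, plus panic_reported and
reports_sent, is a hypothetical stand-in, and the in_die argument is an
assumption about how the two call sites could be told apart. It models a
report function with a re-entry guard that also skips the die path when
the oops will not turn into a panic.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
bool panic_on_oops_sim;   /* models the panic_on_oops setting */
bool panic_reported;      /* re-entry guard: report at most once */
int reports_sent;         /* number of notifications sent to the host */

/* Reset the model before each scenario. */
void reset_sim(bool oops_panics)
{
    panic_on_oops_sim = oops_panics;
    panic_reported = false;
    reports_sent = 0;
}

/* Models the report function; in_die is an assumed extra argument. */
void report_panic_sim(bool in_die)
{
    /* On the die path, stay quiet unless the oops will become a panic. */
    if (in_die && !panic_on_oops_sim)
        return;

    /* Re-entry guard: a second caller (die, then panic) is ignored. */
    if (panic_reported)
        return;
    panic_reported = true;

    reports_sent++;
}

/* Models the panic notifier being invoked. */
void panic_sim(void)
{
    report_panic_sim(false);
}

/* Models die(): run the die notifier, then panic only if configured to. */
void die_sim(void)
{
    report_panic_sim(true);
    if (panic_on_oops_sim)
        panic_sim();
}
```

In this model, calling die_sim() after reset_sim(true) sends exactly one
report even though both paths run, and calling it after reset_sim(false)
sends none, so the host is not told about a panic while the VM keeps
running. Without the first check in report_panic_sim(), the second case
would still send a report, which is the problem described above.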
Michael