RE: [PATCH] Only allow to set crash_kexec_post_notifiers on boot time
From: Michael Kelley
Date: Thu Sep 24 2020 - 12:43:09 EST
From: Eric W. Biederman <ebiederm@xxxxxxxxxxxx> Sent: Thursday, September 24, 2020 9:26 AM
>
> Michael Kelley <mikelley@xxxxxxxxxxxxx> writes:
>
> >> >
> >> > Added Hyper-V people and the people who created the param; it is the
> >> > commit below. I also want to remove it if possible, let's see what people
> >> > think, but the least we should do is disable the auto setting in both systemd
> >> > and kernel:
> >
> > Hyper-V uses a notifier to inform the host system that a Linux VM has
> > panic'ed. Informing the host is particularly important in a public cloud
> > such as Azure so that the cloud software can alert the customer, and can
> > track cloud-wide reliability statistics. Whether a kdump is taken is controlled
> > entirely by the customer and how he configures the VM, and we want
> > the host to be informed either way.
>
> Why?
>
> Why does the host care?
> Especially if the VM continues executing into a kdump kernel?

The host itself doesn't care. But the host is a convenient out-of-band
channel for recording that a panic has occurred and for collecting basic data
about the panic. This out-of-band channel is then used to notify the end
customer that his VM has panic'ed. Sure, the customer should be running
his own monitoring software, but customers don't always do what they
should. Equally important, the out-of-band channel allows the cloud
infrastructure software to notice trends, such as that the rate of Linux
panics has increased, and that perhaps there is a cloud problem that
should be investigated.
>
> Further, like I have mentioned every time something like this has come up,
> a call on the kexec on panic code path should be a direct call (that can
> be audited), not something hidden in a notifier call chain (which cannot).
>
The use case I describe has no particular requirement that it be
implemented via the notifier call chain. If there's a better way to run
some out-of-band notification code on all Linux panics regardless of
whether a kdump is taken, we're open to such an alternative.
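
To make the two options concrete, here is a minimal user-space sketch, not
kernel code. Every name in it (panic_model, panic_via_chain, panic_direct,
report_to_host) is hypothetical and exists only for illustration. It assumes
the panic path jumps to the kdump kernel before running the notifier chain
unless crash_kexec_post_notifiers is set, and it records the order of steps
in a string so the difference is visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define MAX_NOTIFIERS 8
#define LOG_SIZE 128

typedef void (*panic_fn)(char *log);

struct panic_model {
	panic_fn chain[MAX_NOTIFIERS];	/* stands in for the notifier call chain */
	size_t n_chain;
	bool kdump_loaded;		/* a crash kernel has been loaded */
	bool post_notifiers;		/* models crash_kexec_post_notifiers */
};

/* Append "step;" to the log if it still fits in LOG_SIZE bytes. */
static void log_step(char *log, const char *step)
{
	if (strlen(log) + strlen(step) + 2 <= LOG_SIZE) {
		strcat(log, step);
		strcat(log, ";");
	}
}

/* Stand-in for the out-of-band "this VM panicked" report to the host. */
void report_to_host(char *log)
{
	log_step(log, "host");
}

void model_register(struct panic_model *m, panic_fn fn)
{
	assert(m->n_chain < MAX_NOTIFIERS);
	m->chain[m->n_chain++] = fn;
}

/* Option A: the host report is just one entry on the notifier chain. */
void panic_via_chain(const struct panic_model *m, char *log)
{
	if (m->kdump_loaded && !m->post_notifiers) {
		log_step(log, "kdump");	/* assumed: jump happens first */
		return;			/* the chain never runs */
	}
	for (size_t i = 0; i < m->n_chain; i++)
		m->chain[i](log);
	if (m->kdump_loaded)
		log_step(log, "kdump");
}

/* Option B: the host report is a direct, auditable call on the panic path. */
void panic_direct(const struct panic_model *m, char *log)
{
	report_to_host(log);		/* always runs, visible at the call site */
	if (m->kdump_loaded && !m->post_notifiers) {
		log_step(log, "kdump");
		return;
	}
	for (size_t i = 0; i < m->n_chain; i++)
		m->chain[i](log);
	if (m->kdump_loaded)
		log_step(log, "kdump");
}
```

In this model, option A reports to the host ahead of a kdump only when
post_notifiers is set, while option B reports on every panic without
depending on that setting.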

Michael