Re: [PATCH 3/3] x86/quirks: Add parameter to clear MSIs early on boot

From: Guilherme G. Piccoli
Date: Mon Oct 22 2018 - 15:44:18 EST


On 18/10/2018 17:30, Sinan Kaya wrote:
>
> AFAIK, all shutdown (not remove) routines are called before launching
> the next kernel even in crash scenario. It is not safe to start the new
> kernel while hardware is doing a DMA to the system memory and triggering
> interrupts.

Hi Sinan,

I agree with you, it's definitely not safe to start a new kernel with
in-flight DMA transactions, but in the crash scenario I think the
rationale was that the running kernel is broken, so trying to gracefully
shut down the devices is even more unreliable than hoping for the best
and starting the kdump kernel right away heheh

The fact is that the shutdown handlers are not called in the crash
scenario. They are invoked from device_shutdown(); the code paths are as
follows:

Regular kexec flow:

syscall_reboot()
  kernel_kexec()
    kernel_restart_prepare()
      device_shutdown()
    machine_kexec()

However, if CONFIG_KEXEC_JUMP is set, this flow doesn't call
device_shutdown() either.


Crash kexec flow:

__crash_kexec()
  machine_kexec()

There are some entry points to __crash_kexec(), such as panic() or, on
x86, die().

To validate this, one can boot a kernel with the "initcall_debug"
parameter and perform a kexec: if the shutdown handlers are called,
there's a dev_info() call that prints a message per device.
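
For anyone who wants to automate that check, here is a minimal sketch in
Python that scans a captured console log for the per-device messages. It
assumes the lines look like "<driver> <device>: shutdown" (optionally
preceded by a timestamp), which is my reading of the dev_info() output and
not something verified on every kernel version. The helper name
shutdown_devices() and the sample log are hypothetical, for illustration
only.

```python
import re

# Assumption: with initcall_debug, each shutdown handler logs a line such as
#   "[  101.000002] ahci 0000:00:1f.2: shutdown"
# The exact message text may differ between kernel versions.
SHUTDOWN_RE = re.compile(
    r"^(?:\[\s*\d+\.\d+\]\s*)?(?P<dev>.+?): shutdown(?:_pre)?\s*$"
)


def shutdown_devices(log_text):
    """Return the device prefixes that logged a shutdown message."""
    devices = []
    for line in log_text.splitlines():
        match = SHUTDOWN_RE.match(line)
        if match:
            devices.append(match.group("dev"))
    return devices


# Hypothetical console capture, not taken from a real machine.
sample = """\
[  101.000001] kvm: exiting hardware virtualization
[  101.000002] ahci 0000:00:1f.2: shutdown
[  101.000003] e1000e 0000:00:19.0: shutdown
[  101.000004] kexec_core: Starting new kernel
"""

print(shutdown_devices(sample))
```

An empty list from a log captured during a crash kexec would be consistent
with the shutdown handlers being skipped in that path, while a regular
kexec should list one entry per device that has a handler.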


> Shutdown routine in PCI core used to disable MSI/MSI-x on behalf of all
> endpoints but it was later decided that this is the responsibility of the
> endpoint driver.
>

This may be a good idea: using the PCI layer to disable MSIs in the
quiesce path of the broken kernel. I'll follow up on this discussion in
my reply to Bjorn.

Thanks,


Guilherme