Re: [PATCH 0/3] irqchip: GIC kexec/kdump improvement and workarounds

From: Marc Zyngier
Date: Wed Mar 14 2018 - 13:42:14 EST


On 14/03/18 17:11, Thomas Gleixner wrote:
> On Wed, 14 Mar 2018, Mark Rutland wrote:
>> On Tue, Mar 13, 2018 at 06:35:07PM +0000, Marc Zyngier wrote:
>>> On 13/03/18 17:51, Mark Rutland wrote:
>>>> On Tue, Mar 13, 2018 at 05:21:00PM +0000, Marc Zyngier wrote:
>>>>> As kexec and kdump are getting used a bit more intensively, I've been
>>>>> made aware of a number of shortcomings.
>>>>>
>>>>> The main gripe is from folks trying to launch a kdump kernel from
>>>>> within an interrupt handler. If using EOImode==1, things work as
>>>>> expected. If using EOImode==0 (such as in a guest), the secondary
>>>>> kernel hangs as the previous interrupt hasn't been EOI'd, and the
>>>>> active priority is still set. The first two patches are addressing
>>>>> this situation for both GICv2 and GICv3 by resetting the APRs to their
>>>>> default value.
>>>>
>>>> As a more general thing, if irqchip drivers have state that needs to be
>>>> reset in their init code, can we leave all this irqchip reset to the
>>>> crashdump kernel, and kill machine_kexec_mask_interrupts() entirely?
>>>
>>> We could, once we know for sure that all the potential irqchips have
>>> been fixed. Or we could just remove it immediately, and see what breaks.
>>
>> I would be very tempted to do the latter.
>
> Makes sense. Do we have any indicator that tells us that a particular irq
> chip is missing something in the init code or do we have to rely on crash
> reports?

One way to work out what is potentially missing is to check that
everything we remove from machine_kexec_mask_interrupts() has an
equivalent in the irqchip init code. Not an easy task, and certainly not
perfect (patches 1 and 2 in this series have no equivalent in the kexec
code).

There is still another category of "reset" stuff that belongs to the
teardown path, and that's for things that may have an impact on the
secondary kernel.

The case I have in mind is that of the GIC LPI pending tables. These are
allocated to the GIC, which can write pending bits at any time. Think of
it as a DMA engine. At the moment we enter the secondary kernel, we must
make sure the GIC has already been shut down, as the table memory will
be reallocated.

For that particular case, I've started looking at some "reset" API that
an irqchip could register with, and get called back on kexec/kdump. Not
completely dissimilar to the shutdown method that some IOMMU drivers use
to gracefully stop in the same circumstances.
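
To make the idea a bit more concrete, here is a rough userspace sketch
of what such a registration could look like. Every name in it
(irqchip_reset_hook, irqchip_register_reset, irqchip_run_reset_hooks,
fake_gic) is made up for illustration and nothing like it exists in the
tree today; the "GIC" is a toy struct, not real hardware access.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook an irqchip driver would fill in and register. */
struct irqchip_reset_hook {
	void (*reset)(void *data);	/* quiesce the irqchip */
	void *data;			/* driver private data */
	struct irqchip_reset_hook *next;
};

/* Singly linked list of registered hooks (no locking in this sketch). */
static struct irqchip_reset_hook *reset_hooks;

/* Hypothetical registration entry point, called from driver init. */
void irqchip_register_reset(struct irqchip_reset_hook *hook)
{
	assert(hook && hook->reset);
	hook->next = reset_hooks;
	reset_hooks = hook;
}

/*
 * Hypothetical teardown entry point, assumed to be called on the
 * kexec/kdump path before jumping to the secondary kernel.
 * Returns the number of hooks that were run.
 */
int irqchip_run_reset_hooks(void)
{
	int n = 0;

	for (struct irqchip_reset_hook *h = reset_hooks; h; h = h->next) {
		h->reset(h->data);
		n++;
	}
	return n;
}

/* Toy stand-in for an irqchip that may write to its pending table. */
struct fake_gic {
	int lpis_enabled;
};

/* Stop the toy irqchip so it no longer touches memory it was given. */
void fake_gic_reset(void *data)
{
	struct fake_gic *gic = data;

	gic->lpis_enabled = 0;
}
```

A driver would register one hook per irqchip instance at init time, and
the kexec/kdump path would run all of them before handing over, so that
nothing is still writing to memory the secondary kernel is about to
reuse.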

Thanks,

M.
--
Jazz is not dead. It just smells funny...