Re: [RESEND 1/1] gic: its: Make sure a LPI is discarded before free.

From: Marc Zyngier
Date: Thu Jan 10 2019 - 07:55:48 EST


On 10/01/2019 09:42, Zhao, Yuanyuan wrote:
>
>
>> -----Original Message-----
>> From: Marc Zyngier [mailto:marc.zyngier@xxxxxxx]
>> Sent: 2019年1月9日 17:52
>> To: Zhao, Yuanyuan <yuanyuan.zhao@xxxxxxxxxxxxxxxx>
>> Cc: tglx@xxxxxxxxxxxxx; jason@xxxxxxxxxxxxxx; linux-
>> kernel@xxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; Zheng, Joey
>> <yu.zheng@xxxxxxxxxxxxxxxx>; Wang, Dongsheng <dongsheng.wang@hxt-
>> semitech.com>
>> Subject: Re: [RESEND 1/1] gic: its: Make sure a LPI is discarded before free.
>>
>> On 09/01/2019 09:29, Zhao, Yuanyuan wrote:
>>> Hi Marc:
>>>
>>> Thank you for your reply.
>>>
>>> As you said, APIs such as free_irq will deactivate irq before free it.
>>> But deactivation is not forced by every API, for example
>>> irq_dispose_mapping. So I think it's better to check that irq was
>>> deactivated as expected.
>>
>> In general, we should fix the problem at the core API level instead of hacking
>> individual drivers.
>>
>> But more to the point, irq_dispose_mapping is not supposed to do anything
>> with the an active irq, as it doesn't have the required information to safely
>> remove it.
>>
>> So calling irq_dispose_mapping on an interrupt that still has registered
>> actions is a bug, and I'm not convinced we want to cater for such a case. Do
>> you have a concrete example of some kernel code expecting this behaviour?
>>
>> Thanks,
>>
>> M.
>>
>
> Most driver use free_irq after register actions, I found this problem by a test case.

Right, so that's not a fix at all. Not following the API is the bug.

> But if this problem happen and the same DeviceID & EventID are reused,
> the freed ITT will be visit which cause delayed kernel panic,
> the prev INTs are triggered unexpected, but the new INTs lost.

If you're disposing of the mapping and yet have not freed the interrupt,
then anything can happen. The kernel doesn't try to protect yourself
from your own mistakes.

> So I think this check spend less, but gains more.

I beg to differ. This brings nothing but a very weak guarantee that only
applies to a particular class of HW, makes a mess of the API, and papers
over grave bugs that should be fixed.

Thanks,

M.
--
Jazz is not dead. It just smells funny...