Re: 5.11 new lockdep warning related to led-class code (also may involve ata / piix controller)

From: Hans de Goede
Date: Thu Jan 28 2021 - 08:04:08 EST


Hi,

On 1/27/21 11:01 PM, Pavel Machek wrote:
> Hi!
>
>>>>> Booting a 5.11-rc2 kernel with lockdep enabled inside a virtualbox vm (which still
>>>>> emulates good old piix ATA controllers) I get the below lockdep splat early on during boot:
>>>>>
>>>>> This seems to be led-class related but also seems to have a (P)ATA
>>>>> part to it. To the best of my knowledge this is a new problem in
>>>>> 5.11 .
>>>>
>>>> This is on my for-next branch:
>>>>
>>>> commit 9a5ad5c5b2d25508996f10ee6b428d5df91d9160 (HEAD -> for-next, origin/for-next)
>>>>
>>>> leds: trigger: fix potential deadlock with libata
>>>>
>>>> We have the following potential deadlock condition:
>>>>
>>>> ========================================================
>>>> WARNING: possible irq lock inversion dependency detected
>>>> 5.10.0-rc2+ #25 Not tainted
>>>> --------------------------------------------------------
>>>> swapper/3/0 just changed the state of lock:
>>>> ffff8880063bd618 (&host->lock){-...}-{2:2}, at: ata_bmdma_interrupt+0x27/0x200
>>>> but this lock took another, HARDIRQ-READ-unsafe lock in the past:
>>>> (&trig->leddev_list_lock){.+.?}-{2:2}
>>>>
>>>> If I'm not mistaken, that should fix your issue.
>>>
>>> I can confirm that this fixes things, thanks.
>>>
>>> I assume that this will be part of some future 5.11 fixes pull-req?
>>
>> This *regression* fix seems to still have not landed in 5.11-rc5, can
>> we please get this on its way to Linus ?
>
> Is it a regression? AFAIK it is a bug that has been there
> forever... My original plan was to simply wait for 5.12, so it gets
> full release of testing...

It may have been a pre-existing bug which got triggered by libata changes?

I don't know. I almost always run all my locally built kernels with lockdep
enabled, and as the maintainer of the vboxvideo, vboxguest and vboxsf guest
drivers in the mainline kernel I quite often boot locally built kernels inside
a VM.

So I believe that lockdep tripping over this is new in 5.11, which is why
I called it a regression.

And the fix seems very safe and simple, so IMHO it would be good to get
this into 5.11.

Regards,

Hans