Re: [regression] BUG: KASAN: use-after-free in lockdep_register_key+0x755/0x8f0

From: Kalle Valo
Date: Thu May 30 2024 - 04:18:31 EST


Kalle Valo <kvalo@xxxxxxxxxx> writes:

> Dan Williams <dan.j.williams@xxxxxxxxx> writes:
>
>> Kalle Valo wrote:
>>
>>> Kalle Valo <kvalo@xxxxxxxxxx> writes:
>>>
>>> > Kalle Valo <kvalo@xxxxxxxxxx> writes:
>>> >
>>> >> Yesterday I ran our ath11k regression tests with v6.10-rc1 and our
>>> >> simple ath11k module reload stress test started failing reliably with
>>> >> various KASAN errors. The test removes and inserts ath11k and other
>>> >> wireless modules in a loop. Usually I run it at least 100 times,
>>> >> sometimes even more, and there were no issues until yesterday.
>>> >>
>>> >> I have verified that the last wireless-next pull request (tag
>>> >> wireless-next-2024-05-08) works without issues and v6.10-rc1 always
>>> >> fails, usually within 50 module reload loops. From this I'm _guessing_
>>> >> that we have a regression outside wireless, most probably introduced
>>> >> between v6.9 and v6.10-rc1. But of course I cannot be sure of anything
>>> >> yet.
>>> >>
>>> >> I see different KASAN warnings and lockdep always seems to be visible in
>>> >> the stack traces. I think I can reproduce the issue within 15 minutes or
>>> >> so. Before I start bisecting, has anyone else seen anything similar? Or
>>> >> any suggestions on how to debug this further?
>>> >>
>>> >> I have included some crash logs below; they were retrieved using
>>> >> netconsole. Here's a summary of the errors:
>>> >>
>>> >> [ 159.970765] KASAN: maybe wild-memory-access in range [0xbbbbbbbbbbbbbbb8-0xbbbbbbbbbbbbbbbf]
>>> >> [ 700.017632] BUG: KASAN: use-after-free in lockdep_register_key+0x755/0x8f0
>>> >> [ 224.695821] BUG: KASAN: slab-out-of-bounds in lockdep_register_key+0x755/0x8f0
>>> >> [ 259.666542] BUG: KASAN: slab-use-after-free in lockdep_register_key+0x755/0x8f0
>>
>> The proposed fix for that is here:
>>
>> http://lore.kernel.org/r/66560aa9dbedb_195e294b0@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.notmuch
>
> I get "Not Found" from that link, is there a typo?

I found this fix in the for-linus branch of the PCI tree:

PCI: Fix missing lockdep annotation for pci_cfg_access_trylock()
https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/commit/?h=for-linus&id=f941b9182c54a885a9d5d4cfd97af66873c98560

But at least that doesn't fix my crash.
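
For anyone comparing logs, here is a minimal sketch of how a summary like the one quoted above could be pulled out of a netconsole capture. It is not the tooling used for this report: the function name summarize_kasan and the regular expression are illustrative assumptions, and it only matches the one-line report headers, not the stack traces that follow them.

```python
import re

# Hypothetical pattern for KASAN report headers in a kernel log, e.g.
# "[  700.017632] BUG: KASAN: use-after-free in lockdep_register_key+0x755/0x8f0"
# The "BUG: " prefix is optional because some reports omit it.
KASAN_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(?:BUG: )?KASAN: (.+)$")


def summarize_kasan(log_text):
    """Return a list of (timestamp, message) tuples, one per KASAN header line."""
    hits = []
    for line in log_text.splitlines():
        match = KASAN_RE.match(line.strip())
        if match:
            hits.append((float(match.group(1)), match.group(2).strip()))
    return hits


if __name__ == "__main__":
    import sys

    # Usage (assumed): python3 kasan_summary.py < netconsole.log
    for timestamp, message in summarize_kasan(sys.stdin.read()):
        print(f"[{timestamp:12.6f}] {message}")
```

Grouping the extracted messages by the reported function (lockdep_register_key in the traces above) makes it easier to see whether different KASAN report types point at the same code path.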

--
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches