Re: rtmutex deadlock and memory corruption when running gcc testsuite in next-20260303
From: Vlastimil Babka (SUSE)
Date: Thu Mar 05 2026 - 04:12:58 EST
On 3/4/26 21:44, Bert Karwatzki wrote:
> On Wednesday, 04.03.2026 at 15:15 +0100, Vlastimil Babka (SUSE) wrote:
>> On 3/3/26 11:21 PM, Bert Karwatzki wrote:
>> > I tried building gcc-14 from the debian repositories (fetched via apt-get
>> > source gcc-14) on my new and shiny zen5 machine (CPU: "AMD Ryzen 9 9950X
>> > 16-Core Processor") running Debian stable/trixie and linux-next-20260303
>> > (PREEMPT_RT=y) with the following command:
>>
>> It's probably my fault, sorry about that.
>> Specifically commit 666a739089c from slab/for-next-fixes.
>> Does the following fix it? The fixed commit has meanwhile been
>> queued, so it will hopefully be in next-20260304.
>>
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 740edbad0475..1871c5ef354a 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -4610,6 +4610,7 @@ __pcs_replace_empty_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
>>  	}
>>
>>  	local_unlock(&s->cpu_sheaves->lock);
>> +	pcs = NULL;
>>
>>  	if (!allow_spin)
>>  		return NULL;
>>
>>
>
> Yes, this seems to fix the issue (two test runs without errors). So the problem was
> __pcs_replace_empty_main() returning a non-NULL pcs in the case where it can't take
> the lock (&s->cpu_sheaves->lock) and jumps to barn_put:.
Thanks for confirming and sorry again!
> Bert Karwatzki
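
For readers of the archive, here is a minimal userspace sketch of the failure
mode Bert describes above. It is not the mm/slub.c code: struct pcs_model,
model_trylock(), model_unlock() and replace_empty_model() are hypothetical
stand-ins, and the apply_fix flag exists only to compare the behaviour with and
without the one-line change. The assumed contract is that a non-NULL return
value means the lock is held on return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-CPU structure guarded by a lock. */
struct pcs_model {
	bool locked;
	int main_objects;
};

/* Hypothetical trylock; can_lock simulates whether re-locking succeeds. */
static bool model_trylock(struct pcs_model *p, bool can_lock)
{
	if (!can_lock)
		return false;
	p->locked = true;
	return true;
}

static void model_unlock(struct pcs_model *p)
{
	p->locked = false;
}

/*
 * Assumed contract: called with the lock held; a non-NULL return value
 * means the lock is held again on return, NULL means it is not.
 */
static struct pcs_model *replace_empty_model(struct pcs_model *pcs,
					     bool can_relock, bool apply_fix)
{
	struct pcs_model *self = pcs;	/* models looking the pointer up again */

	assert(pcs->locked);

	model_unlock(pcs);
	if (apply_fix)
		pcs = NULL;	/* models the one-line fix: pointer is now stale */

	/* ... work done without the lock would go here ... */

	if (!model_trylock(self, can_relock))
		goto out;	/* models the jump to barn_put: */

	pcs = self;
	pcs->main_objects = 1;
out:
	return pcs;	/* without the fix this can be non-NULL and unlocked */
}
```

Without the fix, a failed trylock in this model returns the original pointer
although the lock is not held, so a caller that trusts the return value would
touch the structure unlocked and later unlock a lock it does not own. That
would be consistent with the reported deadlock and corruption. With the fix,
the same path returns NULL.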