Re: [syzbot] [kernel?] WARNING in flush_cpu_slab

From: Sebastian Andrzej Siewior
Date: Fri May 24 2024 - 02:43:18 EST


On 2024-05-23 23:03:52 [+0200], Vlastimil Babka wrote:
> I'm puzzled by this. We use local_lock_irqsave() on !PREEMPT_RT everywhere.
> IIUC this warning says we did the irqsave() and then found out somebody else
> already set the owner? But that means they also did that irqsave() and set
> themselves as l->owner. Does that mey there would be a spurious irq enable
> that didn't go through local_unlock_irqrestore()?

correct.

>
> Also this particular stack is from the work, which is scheduled by
> queue_work_on() in flush_all_cpus_locked(), which also has a
> lockdep_assert_cpus_held() so it should fullfill the "the caller must ensure
> the cpu doesn't go away" property. But I think even if this ended up on the
> wrong cpu (for the full duration or migrated while processing the work item)
> somehow, it wouldn't be able to cause such warning, but rather corrupt
> something else

Based on

> >> CPU: 3 PID: 5221 Comm: kworker/3:3 Not tainted 6.9.0-syzkaller-10713-g2a8120d7b482 #0

the code was invoked on CPU3 and the kworker was made for CPU3. This is
all fine. All access for the lock in question is within a few lines so
there is no unbalance lock/ unlock or IRQ-unlock which could explain it.

Sebastian