Re: sched: spinlock recursion in sched_rr_get_interval

From: Peter Zijlstra
Date: Mon Jul 07 2014 - 16:06:07 EST


On Mon, Jul 07, 2014 at 09:55:43AM -0400, Sasha Levin wrote:
> I've also had this one, which looks similar:
>
> [10375.005884] BUG: spinlock recursion on CPU#0, modprobe/10965
> [10375.006573] lock: 0xffff8803a0fd7740, .magic: dead4ead, .owner: modprobe/10965, .owner_cpu: 15
> [10375.007412] CPU: 0 PID: 10965 Comm: modprobe Tainted: G W 3.16.0-rc3-next-20140704-sasha-00023-g26c0906-dirty #765

Something's fucked; so we have:

  debug_spin_lock_before()
    SPIN_BUG_ON(lock->owner == current, "recursion");

which causes that. _HOWEVER_, look at .owner_cpu and the reporting CPU!!
How can the lock owner own the lock on CPU 15 and then contend for it
again on CPU 0? That's impossible.
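
To spell out why the splat contradicts itself, here is a rough userspace
model of the check, not the kernel code. The struct and the model_* names
are made up for illustration. The assumption is that a task holding a
spinlock runs with preemption disabled and so cannot migrate, which means
owner == current should imply owner_cpu == the CPU doing the check.

```c
#include <stddef.h>

/* Hypothetical stand-in for the debug fields of a spinlock. */
struct model_lock {
    const void *owner;  /* task holding the lock, NULL if free */
    int owner_cpu;      /* CPU recorded when the lock was taken */
};

/*
 * Models the quoted check: if the task about to take the lock is
 * already recorded as its owner, report "recursion".
 * Returns NULL when the check passes.
 */
static const char *model_check_before(const struct model_lock *lock,
                                      const void *current_task)
{
    if (lock->owner == current_task)
        return "recursion";
    return NULL;
}

/*
 * Assumed invariant: the owner cannot change CPUs while holding the
 * lock, so if the checking task is the owner, the recorded CPU must be
 * the CPU it is running on right now.
 */
static int model_state_is_consistent(const struct model_lock *lock,
                                     const void *current_task,
                                     int this_cpu)
{
    if (lock->owner != current_task)
        return 1;
    return lock->owner_cpu == this_cpu;
}
```

Feeding in the values from the report (owner is the current task,
owner_cpu 15, check running on CPU 0) trips the recursion check and fails
the consistency test at the same time, which is the contradiction above.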

About when-ish did you start seeing things like this? Lemme go stare
hard at recent changes.
