Re: select_task_rq_fair: WARNING: at kernel/lockdep.c match_held_lock

From: Peter Zijlstra
Date: Thu Aug 04 2011 - 11:14:27 EST


On Thu, 2011-08-04 at 17:05 +0200, Peter Zijlstra wrote:
> On Thu, 2011-08-04 at 17:13 +0300, Sergey Senozhatsky wrote:
> > Hello,
> > Got the following trace on 3.0-git19 (07865-g1280ea8):
> >
> > [ 132.794685] WARNING: at kernel/lockdep.c:3117 match_held_lock+0xf6/0x12e()
> > [ 132.794687] Hardware name: Aspire 5741G
> > [ 132.794689] Modules linked in: kvm_intel kvm tun ipv6 microcode snd_hda_codec_hdmi snd_hda_codec_realtek broadcom snd_hda_intel snd_hda_codec tg3 snd_pcm snd_timer snd soundcore acer_wmi evdev libphy sparse_keymap psmouse snd_page_alloc
> > pcspkr battery ac wmi button ehci_hcd sr_mod cdrom usbcore sd_mod ahci
> > [ 132.794731] Pid: 4029, comm: qemu-system-x86 Not tainted 3.1.0-dbg-07865-g1280ea8-dirty #668
> > [ 132.794733] Call Trace:
> > [ 132.794736] <IRQ> [<ffffffff8103e4e0>] warn_slowpath_common+0x7e/0x96
> > [ 132.794744] [<ffffffff8103e50d>] warn_slowpath_null+0x15/0x17
> > [ 132.794748] [<ffffffff8106dcee>] match_held_lock+0xf6/0x12e
> > [ 132.794751] [<ffffffff8106dd88>] lock_is_held+0x62/0xa6
> > [ 132.794757] [<ffffffff81086471>] cgroup_lock_is_held+0x10/0x12
> > [ 132.794762] [<ffffffff810368a2>] set_task_cpu+0x1ac/0x3e3
> > [ 132.794766] [<ffffffff8103856a>] ? select_task_rq_fair+0x5c0/0x9ca
> > [ 132.794769] [<ffffffff8103748d>] ? try_to_wake_up+0x29/0x28b
> > [ 132.794773] [<ffffffff8103748d>] ? try_to_wake_up+0x29/0x28b
> > [ 132.794779] [<ffffffff812552a5>] ? do_raw_spin_lock+0x6b/0x122
> > [ 132.794783] [<ffffffff81037603>] try_to_wake_up+0x19f/0x28b
> > [ 132.794787] [<ffffffff810603ed>] ? update_rmtp+0x65/0x65
> > [ 132.794790] [<ffffffff8103770e>] wake_up_process+0x10/0x12
> > [ 132.794794] [<ffffffff8106040a>] hrtimer_wakeup+0x1d/0x21
> > [ 132.794797] [<ffffffff81060816>] __run_hrtimer+0x1b1/0x372
> > [ 132.794800] [<ffffffff810613a2>] hrtimer_interrupt+0xe6/0x1b0
> > [ 132.794805] [<ffffffff810185d5>] smp_apic_timer_interrupt+0x80/0x93
> > [ 132.794810] [<ffffffff81493af3>] apic_timer_interrupt+0x73/0x80
> > [ 132.794812] <EOI> [<ffffffff810fb998>] ? do_mmu_notifier_register+0x66/0x125
> > [ 132.794822] [<ffffffff810ec132>] ? mm_take_all_locks+0x10b/0x165
> > [ 132.794826] [<ffffffff810ec160>] ? mm_take_all_locks+0x139/0x165
> > [ 132.794829] [<ffffffff810ec132>] ? mm_take_all_locks+0x10b/0x165
> > [ 132.794832] [<ffffffff810fb9a0>] do_mmu_notifier_register+0x6e/0x125
> > [ 132.794836] [<ffffffff810fba72>] mmu_notifier_register+0xe/0x10
> > [ 132.794852] [<ffffffffa01fcc0d>] kvm_dev_ioctl+0x297/0x400 [kvm]
> > [ 132.794857] [<ffffffff81119022>] do_vfs_ioctl+0x46c/0x4ad
> > [ 132.794862] [<ffffffff8110a68d>] ? fget_light+0xed/0x2a7
> > [ 132.794867] [<ffffffff81492fca>] ? sysret_check+0x2e/0x69
> > [ 132.794871] [<ffffffff811190b4>] sys_ioctl+0x51/0x75
> > [ 132.794875] [<ffffffff81492f92>] system_call_fastpath+0x16/0x1b
> > [ 132.794877] ---[ end trace 298584c4014cd2b8 ]---
>
> Curious, how easy is that to reproduce? That really shouldn't happen and
> it's not immediately obvious how it could happen.

In particular, mm_take_all_locks(), which is called from
do_mmu_notifier_register(), uses mutex_lock_nest_lock() in both
vm_lock_anon_vma() and vm_lock_mapping(), both times using mm->mmap_sem
as the nest lock.

As per __lock_acquire() any lock that passes in a nest_lock will set
hlock->references and also assign this nest_lock to hlock->nest_lock,
and as per lock_acquire() all that is done with IRQs disabled, so the
interrupt in question should not be able to observe the state where
->references is set, but ->nest_lock is not.

So I'm at a loss to explain how match_held_lock() observes exactly that:
a lock for which ->references is set but ->nest_lock is not. That should
be impossible.
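
To make the invariant above concrete, here is a minimal user-space
sketch. It is not the lockdep code: struct hlock_model, model_acquire(),
model_consistent() and model_demo() are hypothetical names made up for
illustration, and the model only assumes what is described above, namely
that ->references and ->nest_lock are set together in one step that an
interrupt cannot split.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; NOT the kernel's struct held_lock. */
struct hlock_model {
	const void *instance;    /* the lock being acquired */
	const void *nest_lock;   /* outer lock covering nested acquisitions, or NULL */
	unsigned int references; /* non-zero only when a nest_lock was passed in */
};

/*
 * Models the bookkeeping described above: ->references and ->nest_lock
 * are written in the same step. In the kernel that step runs with IRQs
 * disabled; here it is simply one function call.
 */
static void model_acquire(struct hlock_model *h, const void *instance,
			  const void *nest_lock)
{
	h->instance = instance;
	h->nest_lock = nest_lock;
	h->references = nest_lock ? 1 : 0;
}

/* Returns 1 if the state is consistent, 0 for "references without nest_lock". */
static int model_consistent(const struct hlock_model *h)
{
	return !(h->references && !h->nest_lock);
}

/* Walks through the two legal states and the supposedly impossible one. */
void model_demo(void)
{
	int outer, inner;
	struct hlock_model h;

	/* Nested acquisition: both fields set. */
	model_acquire(&h, &inner, &outer);
	assert(h.references == 1 && model_consistent(&h));

	/* Plain acquisition: neither field set. */
	model_acquire(&h, &inner, NULL);
	assert(h.references == 0 && model_consistent(&h));

	/* The state the warning reports; only reachable by corrupting the model. */
	h.references = 1;
	h.nest_lock = NULL;
	assert(!model_consistent(&h));
}
```

In this model the inconsistent state can only be produced by writing the
fields by hand, which is the sense in which it "should be impossible".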

