Re: [PATCH] lglock: Use spinlock_t instead of arch_spinlock_t

From: Daniel Wagner
Date: Mon Mar 30 2015 - 02:08:23 EST


On 03/26/2015 05:03 PM, Peter Zijlstra wrote:
> On Thu, Mar 26, 2015 at 04:02:08PM +0100, Daniel Wagner wrote:
>> @@ -67,9 +67,9 @@ void lg_global_lock(struct lglock *lg)
>>  	preempt_disable();
>>  	lock_acquire_exclusive(&lg->lock_dep_map, 0, 0, NULL, _RET_IP_);
>>  	for_each_possible_cpu(i) {
>> -		arch_spinlock_t *lock;
>> +		spinlock_t *lock;
>>  		lock = per_cpu_ptr(lg->lock, i);
>> -		arch_spin_lock(lock);
>> +		spin_lock(lock);
>>  	}
>>  }
>
> Nope, that'll blow up in two separate places.
>
> One: lockdep, it can only track a limited number of held locks, and it
> will further report a recursion warning on the 2nd cpu.

I was wondering why I hadn't seen it explode. As it turns out, I hadn't
looked closely enough at dmesg:

[ +0.001231] BUG: MAX_LOCK_DEPTH too low!
[ +0.000092] turning off the locking correctness validator.
[ +0.000092] Please attach the output of /proc/lock_stat to the bug report
[ +0.000094] depth: 48 max: 48!
[ +0.000087] 48 locks held by swapper/0/1:
[ +0.000090] #0: (cpu_hotplug.lock){++++++}, at: [<ffffffff8109e767>] get_online_cpus+0x37/0x80
[ +0.000503] #1: (stop_cpus_mutex){+.+...}, at: [<ffffffff8114b889>] stop_cpus+0x29/0x60
[ +0.000504] #2: (stop_cpus_lock){+.+...}, at: [<ffffffff8114b2de>] queue_stop_cpus_work+0x7e/0xd0
[ +0.000582] #3: (stop_cpus_lock_lock){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.000496] #4: (stop_cpus_lock_lock#2){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.000577] #5: (stop_cpus_lock_lock#3){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.000581] #6: (stop_cpus_lock_lock#4){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.000576] #7: (stop_cpus_lock_lock#5){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.000576] #8: (stop_cpus_lock_lock#6){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.000575] #9: (stop_cpus_lock_lock#7){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.000817] #10: (stop_cpus_lock_lock#8){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001057] #11: (stop_cpus_lock_lock#9){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001057] #12: (stop_cpus_lock_lock#10){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001055] #13: (stop_cpus_lock_lock#11){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001059] #14: (stop_cpus_lock_lock#12){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001056] #15: (stop_cpus_lock_lock#13){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001054] #16: (stop_cpus_lock_lock#14){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001053] #17: (stop_cpus_lock_lock#15){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001052] #18: (stop_cpus_lock_lock#16){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001059] #19: (stop_cpus_lock_lock#17){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001072] #20: (stop_cpus_lock_lock#18){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001057] #21: (stop_cpus_lock_lock#19){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001060] #22: (stop_cpus_lock_lock#20){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001060] #23: (stop_cpus_lock_lock#21){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001060] #24: (stop_cpus_lock_lock#22){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001065] #25: (stop_cpus_lock_lock#23){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001055] #26: (stop_cpus_lock_lock#24){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001058] #27: (stop_cpus_lock_lock#25){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001055] #28: (stop_cpus_lock_lock#26){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001054] #29: (stop_cpus_lock_lock#27){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001052] #30: (stop_cpus_lock_lock#28){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001059] #31: (stop_cpus_lock_lock#29){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001056] #32: (stop_cpus_lock_lock#30){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001149] #33: (stop_cpus_lock_lock#31){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001060] #34: (stop_cpus_lock_lock#32){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001063] #35: (stop_cpus_lock_lock#33){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001064] #36: (stop_cpus_lock_lock#34){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001061] #37: (stop_cpus_lock_lock#35){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001054] #38: (stop_cpus_lock_lock#36){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001059] #39: (stop_cpus_lock_lock#37){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001059] #40: (stop_cpus_lock_lock#38){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001057] #41: (stop_cpus_lock_lock#39){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001056] #42: (stop_cpus_lock_lock#40){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001055] #43: (stop_cpus_lock_lock#41){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001055] #44: (stop_cpus_lock_lock#42){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001059] #45: (stop_cpus_lock_lock#43){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001051] #46: (stop_cpus_lock_lock#44){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001123] #47: (stop_cpus_lock_lock#45){+.+...}, at: [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.001058] INFO: lockdep is turned off.
[ +0.000330] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.0.0-rc5-00001-g70ed1b1 #31
[ +0.000581] Hardware name: Dell Inc. PowerEdge R820/066N7P, BIOS 2.0.20 01/16/2014
[ +0.000582] 0000000000000000 000000002752ae20 ffff881fb119ba88 ffffffff817dcbc1
[ +0.000895] 0000000000000000 ffff885fb14f8000 ffff881fb119bb88 ffffffff810ed5aa
[ +0.000899] ffffffff82885150 ffff885fb14f8000 0000000000000292 0000000000000000
[ +0.000893] Call Trace:
[ +0.000329] [<ffffffff817dcbc1>] dump_stack+0x4c/0x65
[ +0.000333] [<ffffffff810ed5aa>] __lock_acquire+0xfca/0x1f90
[ +0.000334] [<ffffffff8110f72f>] ? rcu_irq_exit+0x7f/0xd0
[ +0.000333] [<ffffffff817e766c>] ? retint_restore_args+0x13/0x13
[ +0.000335] [<ffffffff810ef5a7>] lock_acquire+0xc7/0x160
[ +0.000334] [<ffffffff810f2af6>] ? lg_global_lock+0x66/0x90
[ +0.000334] [<ffffffff8114b130>] ? cpu_stop_should_run+0x50/0x50
[ +0.000335] [<ffffffff817e59dd>] _raw_spin_lock+0x3d/0x80
[ +0.000336] [<ffffffff810f2af6>] ? lg_global_lock+0x66/0x90
[ +0.000335] [<ffffffff810f2af6>] lg_global_lock+0x66/0x90
[ +0.000333] [<ffffffff8114b2de>] ? queue_stop_cpus_work+0x7e/0xd0
[ +0.000338] [<ffffffff8114b2de>] queue_stop_cpus_work+0x7e/0xd0
[ +0.000335] [<ffffffff8114b130>] ? cpu_stop_should_run+0x50/0x50
[ +0.000337] [<ffffffff8114b538>] __stop_cpus+0x58/0xa0
[ +0.000335] [<ffffffff8114b130>] ? cpu_stop_should_run+0x50/0x50
[ +0.000334] [<ffffffff8114b130>] ? cpu_stop_should_run+0x50/0x50
[ +0.000335] [<ffffffff8114b897>] stop_cpus+0x37/0x60
[ +0.000339] [<ffffffff810444e0>] ? mtrr_restore+0xb0/0xb0
[ +0.000337] [<ffffffff8114ba15>] __stop_machine+0xf5/0x130
[ +0.000334] [<ffffffff810444e0>] ? mtrr_restore+0xb0/0xb0
[ +0.000334] [<ffffffff810444e0>] ? mtrr_restore+0xb0/0xb0
[ +0.000337] [<ffffffff8114ba7e>] stop_machine+0x2e/0x50
[ +0.000330] [<ffffffff81044efb>] mtrr_aps_init+0x7b/0x90
[ +0.000433] [<ffffffff81f3faac>] native_smp_cpus_done+0x10b/0x113
[ +0.000335] [<ffffffff81f51d74>] smp_init+0x78/0x80
[ +0.000332] [<ffffffff81f2d1e1>] kernel_init_freeable+0x167/0x28d
[ +0.000336] [<ffffffff817d2690>] ? rest_init+0xd0/0xd0
[ +0.000334] [<ffffffff817d269e>] kernel_init+0xe/0xf0
[ +0.000336] [<ffffffff817e6918>] ret_from_fork+0x58/0x90
[ +0.000333] [<ffffffff817d2690>] ? rest_init+0xd0/0xd0


and after that lockdep is disabled. /me feeling extremely stupid.
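
For anyone who wants to see why the trace stops at exactly 48 entries,
here is a rough userspace model of the held-lock limit. It is only a
sketch, not lockdep code: struct held_locks, model_lock_acquire() and
first_untracked_cpu() are hypothetical names made up for illustration,
and the CPU count passed in is an assumption. The only numbers taken
from the log above are the limit of 48 and the three locks (#0 to #2)
already held when lg_global_lock() is entered.

```c
/* Matches the limit printed in the dmesg above ("depth: 48 max: 48!"). */
#define MAX_LOCK_DEPTH 48

/* Hypothetical userspace stand-in for a per-task held-lock stack. */
struct held_locks {
	int depth;
};

/* Returns 0 on success, -1 if the held-lock stack is already full. */
static int model_lock_acquire(struct held_locks *hl)
{
	if (hl->depth >= MAX_LOCK_DEPTH)
		return -1;
	hl->depth++;
	return 0;
}

/*
 * Models lg_global_lock() taking one tracked lock per possible CPU.
 * Returns the index of the first CPU whose lock no longer fits,
 * or -1 if every per-CPU lock could be tracked.
 */
static int first_untracked_cpu(int already_held, int nr_cpus)
{
	struct held_locks hl = { .depth = already_held };
	int cpu;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (model_lock_acquire(&hl) < 0)
			return cpu;
	}
	return -1;
}
```

With three locks already held, first_untracked_cpu(3, 64) returns 45:
CPUs 0 to 44 fill slots #3 to #47, and the next acquisition is the one
that trips the limit. That matches the 45 per-CPU entries in the list
above.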

> Second: preempt_count_add(), spin_lock() does preempt_disable(), with
> enough CPUs you'll overflow the preempt counter (255).
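
The second point can be sketched the same way. This is a userspace
model under stated assumptions, not kernel code: it assumes the 8-bit
preemption field behind the 255 mentioned above, assumes the count
starts at zero when lg_global_lock() is called, and uses the
hypothetical helpers model_lg_global_lock_count() and
preempt_field_overflows().

```c
#include <stdbool.h>

/* Assumed width of the preemption-depth field (maximum value 255). */
#define PREEMPT_BITS 8
#define PREEMPT_MASK ((1u << PREEMPT_BITS) - 1)

/*
 * Models the nesting depth after the patched lg_global_lock():
 * one explicit preempt_disable(), then one more increment for
 * every spin_lock() taken in the for_each_possible_cpu() loop.
 */
static unsigned int model_lg_global_lock_count(unsigned int nr_cpus)
{
	unsigned int count = 0;
	unsigned int cpu;

	count++;			/* explicit preempt_disable() */
	for (cpu = 0; cpu < nr_cpus; cpu++)
		count++;		/* spin_lock() disables preemption */
	return count;
}

/* True once the depth no longer fits in the 8-bit field. */
static bool preempt_field_overflows(unsigned int count)
{
	return count > PREEMPT_MASK;
}
```

In this model 254 possible CPUs give a depth of 255, which still fits.
With 255 CPUs the depth is 256, the low 8 bits read as zero, and the
carry lands in the bits above the preemption field.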

Thanks for the review.

cheers,
daniel
