RE: [REPORT] possible circular locking dependency when booting a VM on arm64 host

From: Salil Mehta
Date: Thu Jul 09 2020 - 06:54:54 EST


Hi Yuzenghui,
I will try to reproduce it today at our platform. Just one question is it easily
reproducible or is a rare occurrence?

Thanks
Salil

> From: yuzenghui
> Sent: Thursday, July 9, 2020 11:41 AM
> To: Marc Zyngier <maz@xxxxxxxxxx>; Thomas Gleixner <tglx@xxxxxxxxxxxxx>; Linux
> Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx>;
> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> Cc: Zhuangyuzeng (Yisen) <yisen.zhuang@xxxxxxxxxx>; Salil Mehta
> <salil.mehta@xxxxxxxxxx>; Wanghaibin (D) <wanghaibin.wang@xxxxxxxxxx>
> Subject: [REPORT] possible circular locking dependency when booting a VM on arm64
> host
>
> Hi All,
>
> I had seen the following lockdep splat when booting a guest on my
> Kunpeng 920 with GICv4 enabled. I can also trigger the same splat
> on v5.5 so it should already exist in the kernel for a while. I'm
> not sure what the exact problem is and hope someone can have a look!
>
>
> Thanks,
> Zenghui
>
> [ 103.855511] ======================================================
> [ 103.861664] WARNING: possible circular locking dependency detected
> [ 103.867817] 5.8.0-rc4+ #35 Tainted: G W
> [ 103.872932] ------------------------------------------------------
> [ 103.879083] CPU 2/KVM/20515 is trying to acquire lock:
> [ 103.884200] ffff202fcd5865b0 (&irq_desc_lock_class){-.-.}-{2:2}, at:
> __irq_get_desc_lock+0x60/0xa0
> [ 103.893127]
> but task is already holding lock:
> [ 103.898933] ffff202fcfd07f58 (&rq->lock){-.-.}-{2:2}, at:
> __schedule+0x114/0x8b8
> [ 103.906301]
> which lock already depends on the new lock.
>
> [ 103.914441]
> the existing dependency chain (in reverse order) is:
> [ 103.921888]
> -> #3 (&rq->lock){-.-.}-{2:2}:
> [ 103.927438] _raw_spin_lock+0x54/0x70
> [ 103.931605] task_fork_fair+0x48/0x150
> [ 103.935860] sched_fork+0x100/0x268
> [ 103.939856] copy_process+0x628/0x1868
> [ 103.944106] _do_fork+0x74/0x710
> [ 103.947840] kernel_thread+0x78/0xa0
> [ 103.951917] rest_init+0x30/0x270
> [ 103.955742] arch_call_rest_init+0x14/0x1c
> [ 103.960339] start_kernel+0x534/0x568
> [ 103.964503]
> -> #2 (&p->pi_lock){-.-.}-{2:2}:
> [ 103.970224] _raw_spin_lock_irqsave+0x70/0x98
> [ 103.975080] try_to_wake_up+0x5c/0x5b0
> [ 103.979330] wake_up_process+0x28/0x38
> [ 103.983581] create_worker+0x128/0x1b8
> [ 103.987834] workqueue_init+0x308/0x3bc
> [ 103.992172] kernel_init_freeable+0x180/0x33c
> [ 103.997027] kernel_init+0x18/0x118
> [ 104.001020] ret_from_fork+0x10/0x18
> [ 104.005097]
> -> #1 (&pool->lock){-.-.}-{2:2}:
> [ 104.010817] _raw_spin_lock+0x54/0x70
> [ 104.014983] __queue_work+0x120/0x6e8
> [ 104.019146] queue_work_on+0xa0/0xd8
> [ 104.023225] irq_set_affinity_locked+0xa8/0x178
> [ 104.028253] __irq_set_affinity+0x5c/0x90
> [ 104.032762] irq_set_affinity_hint+0x74/0xb0
> [ 104.037540] hns3_nic_init_irq+0xe0/0x210 [hns3]
> [ 104.042655] hns3_client_init+0x2d8/0x4e0 [hns3]
> [ 104.047779] hclge_init_client_instance+0xf0/0x3a8 [hclge]
> [ 104.053760] hnae3_init_client_instance.part.3+0x30/0x68 [hnae3]
> [ 104.060257] hnae3_register_ae_dev+0x100/0x1f0 [hnae3]
> [ 104.065892] hns3_probe+0x60/0xa8 [hns3]
> [ 104.070319] local_pci_probe+0x44/0x98
> [ 104.074573] work_for_cpu_fn+0x20/0x30
> [ 104.078823] process_one_work+0x258/0x618
> [ 104.083333] worker_thread+0x1c0/0x438
> [ 104.087585] kthread+0x120/0x128
> [ 104.091318] ret_from_fork+0x10/0x18
> [ 104.095394]
> -> #0 (&irq_desc_lock_class){-.-.}-{2:2}:
> [ 104.101895] __lock_acquire+0x11bc/0x1530
> [ 104.106406] lock_acquire+0x100/0x3f8
> [ 104.110570] _raw_spin_lock_irqsave+0x70/0x98
> [ 104.115426] __irq_get_desc_lock+0x60/0xa0
> [ 104.120021] irq_set_vcpu_affinity+0x48/0xc8
> [ 104.124793] its_make_vpe_non_resident+0x6c/0xc0
> [ 104.129910] vgic_v4_put+0x64/0x70
> [ 104.133815] vgic_v3_put+0x28/0x100
> [ 104.137806] kvm_vgic_put+0x3c/0x60
> [ 104.141801] kvm_arch_vcpu_put+0x38/0x58
> [ 104.146228] kvm_sched_out+0x38/0x58
> [ 104.150306] __schedule+0x554/0x8b8
> [ 104.154298] schedule+0x50/0xe0
> [ 104.157946] kvm_arch_vcpu_ioctl_run+0x644/0x9e8
> [ 104.163063] kvm_vcpu_ioctl+0x4b4/0x918
> [ 104.167403] ksys_ioctl+0xb4/0xd0
> [ 104.171222] __arm64_sys_ioctl+0x28/0xc8
> [ 104.175647] el0_svc_common.constprop.2+0x74/0x138
> [ 104.180935] do_el0_svc+0x34/0xa0
> [ 104.184755] el0_sync_handler+0xec/0x128
> [ 104.189179] el0_sync+0x140/0x180
> [ 104.192997]
> other info that might help us debug this:
>
> [ 104.200962] Chain exists of:
> &irq_desc_lock_class --> &p->pi_lock --> &rq->lock
>
> [ 104.211261] Possible unsafe locking scenario:
>
> [ 104.217152] CPU0 CPU1
> [ 104.221660] ---- ----
> [ 104.226170] lock(&rq->lock);
> [ 104.229210] lock(&p->pi_lock);
> [ 104.234930] lock(&rq->lock);
> [ 104.240474] lock(&irq_desc_lock_class);
> [ 104.244465]
> *** DEADLOCK ***
>
> [ 104.250356] 2 locks held by CPU 2/KVM/20515:
> [ 104.254606] #0: ffff202fa95680c8 (&vcpu->mutex){+.+.}-{3:3}, at:
> kvm_vcpu_ioctl+0x80/0x918
> [ 104.262921] #1: ffff202fcfd07f58 (&rq->lock){-.-.}-{2:2}, at:
> __schedule+0x114/0x8b8
> [ 104.270717]
> stack backtrace:
> [ 104.275057] CPU: 73 PID: 20515 Comm: CPU 2/KVM Kdump: loaded Tainted:
> G W 5.8.0-rc4+ #35
> [ 104.284404] Hardware name: Huawei TaiShan 2280 V2/BC82AMDDA, BIOS
> 1.05 09/18/2019
> [ 104.291851] Call trace:
> [ 104.294289] dump_backtrace+0x0/0x218
> [ 104.297935] show_stack+0x2c/0x38
> [ 104.301235] dump_stack+0xf0/0x168
> [ 104.304622] print_circular_bug.isra.39+0x22c/0x280
> [ 104.309476] check_noncircular+0x158/0x1c8
> [ 104.313555] __lock_acquire+0x11bc/0x1530
> [ 104.317545] lock_acquire+0x100/0x3f8
> [ 104.321190] _raw_spin_lock_irqsave+0x70/0x98
> [ 104.325529] __irq_get_desc_lock+0x60/0xa0
> [ 104.329606] irq_set_vcpu_affinity+0x48/0xc8
> [ 104.333858] its_make_vpe_non_resident+0x6c/0xc0
> [ 104.338454] vgic_v4_put+0x64/0x70
> [ 104.341840] vgic_v3_put+0x28/0x100
> [ 104.345314] kvm_vgic_put+0x3c/0x60
> [ 104.348789] kvm_arch_vcpu_put+0x38/0x58
> [ 104.352694] kvm_sched_out+0x38/0x58
> [ 104.356256] __schedule+0x554/0x8b8
> [ 104.359730] schedule+0x50/0xe0
> [ 104.362859] kvm_arch_vcpu_ioctl_run+0x644/0x9e8
> [ 104.367457] kvm_vcpu_ioctl+0x4b4/0x918
> [ 104.371277] ksys_ioctl+0xb4/0xd0
> [ 104.374577] __arm64_sys_ioctl+0x28/0xc8
> [ 104.378482] el0_svc_common.constprop.2+0x74/0x138
> [ 104.383250] do_el0_svc+0x34/0xa0
> [ 104.386552] el0_sync_handler+0xec/0x128
> [ 104.390458] el0_sync+0x140/0x180