Hi,
On 11/29/2022 2:06 PM, Tonghao Zhang wrote:
On Tue, Nov 29, 2022 at 12:32 PM Hou Tao <houtao1@xxxxxxxxxx> wrote:The proposed lockdep solution doesn't work. Still got lockdep warning after
Hi,
On 11/29/2022 5:55 AM, Hao Luo wrote:
On Sun, Nov 27, 2022 at 7:15 PM Tonghao Zhang <xiangxia.m.yue@xxxxxxxxx> wrote:It would be better if the test case below can be rewritten as a bpf selftests.
Hi Tonghao,
With a quick look at the htab_lock_bucket() and your problem
statement, I agree with Hou Tao that using hash &
min(HASHTAB_MAP_LOCK_MASK, n_bucket - 1) to index in map_locked seems
to fix the potential deadlock. Can you actually send your changes as
v2 so we can take a look and better help you? Also, can you explain
your solution in your commit message? Right now, your commit message
has only a problem statement and is not very clear. Please include
more details on what you do to fix the issue.
Hao
Please see comments below on how to improve it and reproduce the deadlock.
Thanks for your details instruction. I can reproduce the warning by using yourHi
only a warning from lockdep.
setup. I am not a lockdep expert, it seems that fixing such warning needs to set
different lockdep class to the different bucket. Because we use map_locked to
protect the acquisition of bucket lock, so I think we can define lock_class_key
array in bpf_htab (e.g., lockdep_key[HASHTAB_MAP_LOCK_COUNT]) and initialize the
bucket lock accordingly.
that, so cc +locking expert +lkml.org for lockdep help.
Hi lockdep experts,
We are trying to fix the following lockdep warning from bpf subsystem:
[ 36.092222] ================================
[ 36.092230] WARNING: inconsistent lock state
[ 36.092234] 6.1.0-rc5+ #81 Tainted: G E
[ 36.092236] --------------------------------
[ 36.092237] inconsistent {INITIAL USE} -> {IN-NMI} usage.
[ 36.092238] perf/1515 [HC1[1]:SC0[0]:HE0:SE1] takes:
[ 36.092242] ffff888341acd1a0 (&htab->lockdep_key){....}-{2:2}, at:
htab_lock_bucket+0x4d/0x58
[ 36.092253] {INITIAL USE} state was registered at:
[ 36.092255] mark_usage+0x1d/0x11d
[ 36.092262] __lock_acquire+0x3c9/0x6ed
[ 36.092266] lock_acquire+0x23d/0x29a
[ 36.092270] _raw_spin_lock_irqsave+0x43/0x7f
[ 36.092274] htab_lock_bucket+0x4d/0x58
[ 36.092276] htab_map_delete_elem+0x82/0xfb
[ 36.092278] map_delete_elem+0x156/0x1ac
[ 36.092282] __sys_bpf+0x138/0xb71
[ 36.092285] __do_sys_bpf+0xd/0x15
[ 36.092288] do_syscall_64+0x6d/0x84
[ 36.092291] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 36.092295] irq event stamp: 120346
[ 36.092296] hardirqs last enabled at (120345): [<ffffffff8180b97f>]
_raw_spin_unlock_irq+0x24/0x39
[ 36.092299] hardirqs last disabled at (120346): [<ffffffff81169e85>]
generic_exec_single+0x40/0xb9
[ 36.092303] softirqs last enabled at (120268): [<ffffffff81c00347>]
__do_softirq+0x347/0x387
[ 36.092307] softirqs last disabled at (120133): [<ffffffff810ba4f0>]
__irq_exit_rcu+0x67/0xc6
[ 36.092311]
[ 36.092311] other info that might help us debug this:
[ 36.092312] Possible unsafe locking scenario:
[ 36.092312]
[ 36.092313] CPU0
[ 36.092313] ----
[ 36.092314] lock(&htab->lockdep_key);
[ 36.092315] <Interrupt>
[ 36.092316] lock(&htab->lockdep_key);
[ 36.092318]
[ 36.092318] *** DEADLOCK ***
[ 36.092318]
[ 36.092318] 3 locks held by perf/1515:
[ 36.092320] #0: ffff8881b9805cc0 (&cpuctx_mutex){+.+.}-{4:4}, at:
perf_event_ctx_lock_nested+0x8e/0xba
[ 36.092327] #1: ffff8881075ecc20 (&event->child_mutex){+.+.}-{4:4}, at:
perf_event_for_each_child+0x35/0x76
[ 36.092332] #2: ffff8881b9805c20 (&cpuctx_lock){-.-.}-{2:2}, at:
perf_ctx_lock+0x12/0x27
[ 36.092339]
[ 36.092339] stack backtrace:
[ 36.092341] CPU: 0 PID: 1515 Comm: perf Tainted: G E
6.1.0-rc5+ #81
[ 36.092344] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 36.092349] Call Trace:
[ 36.092351] <NMI>
[ 36.092354] dump_stack_lvl+0x57/0x81
[ 36.092359] lock_acquire+0x1f4/0x29a
[ 36.092363] ? handle_pmi_common+0x13f/0x1f0
[ 36.092366] ? htab_lock_bucket+0x4d/0x58
[ 36.092371] _raw_spin_lock_irqsave+0x43/0x7f
[ 36.092374] ? htab_lock_bucket+0x4d/0x58
[ 36.092377] htab_lock_bucket+0x4d/0x58
[ 36.092379] htab_map_update_elem+0x11e/0x220
[ 36.092386] bpf_prog_f3a535ca81a8128a_bpf_prog2+0x3e/0x42
[ 36.092392] trace_call_bpf+0x177/0x215
[ 36.092398] perf_trace_run_bpf_submit+0x52/0xaa
[ 36.092403] ? x86_pmu_stop+0x97/0x97
[ 36.092407] perf_trace_nmi_handler+0xb7/0xe0
[ 36.092415] nmi_handle+0x116/0x254
[ 36.092418] ? x86_pmu_stop+0x97/0x97
[ 36.092423] default_do_nmi+0x3d/0xf6
[ 36.092428] exc_nmi+0xa1/0x109
[ 36.092432] end_repeat_nmi+0x16/0x67
[ 36.092436] RIP: 0010:wrmsrl+0xd/0x1b