Re: [PATCH 2/2] fsnotify: allow sleepable child flag update
From: Yujie Liu
Date: Thu Oct 27 2022 - 04:46:46 EST
On Thu, Oct 27, 2022 at 03:50:17PM +0800, kernel test robot wrote:
> Greetings,
>
> FYI, we noticed WARNING:possible_recursive_locking_detected due to commit (built with clang-14):
>
> commit: bed2685d9557ff9a7705f4172651a138e5f705af ("[PATCH 2/2] fsnotify: allow sleepable child flag update")
> url: https://github.com/intel-lab-lkp/linux/commits/Stephen-Brennan/fsnotify-Protect-i_fsnotify_mask-and-child-flags-with-inode-rwsem/20221018-131326
> base: https://git.kernel.org/cgit/linux/kernel/git/jack/linux-fs.git fsnotify
> patch link: https://lore.kernel.org/linux-fsdevel/20221018041233.376977-3-stephen.s.brennan@xxxxxxxxxx
> patch subject: [PATCH 2/2] fsnotify: allow sleepable child flag update
>
> in testcase: boot
>
> on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>
> caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
Sorry, this report is for the v1 patch, which seems to be obsolete now.
Please check the details in the report. If the issue has already been
fixed in v2, please ignore this report. Thanks.
--
Best Regards,
Yujie
> [ 31.979147][ T1]
> [ 31.979446][ T1] ============================================
> [ 31.980051][ T1] WARNING: possible recursive locking detected
> [ 31.980674][ T1] 6.0.0-rc4-00066-gbed2685d9557 #1 Not tainted
> [ 31.981286][ T1] --------------------------------------------
> [ 31.981889][ T1] systemd/1 is trying to acquire lock:
> [ 31.982432][ T1] ffff88813f542510 (&dentry->d_lock){+.+.}-{2:2}, at: lockref_get+0xd/0x80
> [ 31.983314][ T1]
> [ 31.983314][ T1] but task is already holding lock:
> [ 31.984040][ T1] ffff888100441b18 (&dentry->d_lock){+.+.}-{2:2}, at: __fsnotify_update_child_dentry_flags+0x85/0x2c0
> [ 31.985132][ T1]
> [ 31.985132][ T1] other info that might help us debug this:
> [ 31.985967][ T1] Possible unsafe locking scenario:
> [ 31.985967][ T1]
> [ 31.986694][ T1] CPU0
> [ 31.987025][ T1] ----
> [ 31.987366][ T1] lock(&dentry->d_lock);
> [ 31.987828][ T1] lock(&dentry->d_lock);
> [ 31.988283][ T1]
> [ 31.988283][ T1] *** DEADLOCK ***
> [ 31.988283][ T1]
> [ 31.989061][ T1] May be due to missing lock nesting notation
> [ 31.989061][ T1]
> [ 31.989888][ T1] 3 locks held by systemd/1:
> [ 31.990361][ T1] #0: ffff88815249e128 (&group->mark_mutex){+.+.}-{3:3}, at: __x64_sys_inotify_add_watch+0x2fc/0xc00
> [ 31.991473][ T1] #1: ffff888100480af8 (&sb->s_type->i_mutex_key){++++}-{3:3}, at: fsnotify_recalc_mask+0xf1/0x1c0
> [ 31.992528][ T1] #2: ffff888100441b18 (&dentry->d_lock){+.+.}-{2:2}, at: __fsnotify_update_child_dentry_flags+0x85/0x2c0
> [ 31.993671][ T1]
> [ 31.993671][ T1] stack backtrace:
> [ 31.994260][ T1] CPU: 0 PID: 1 Comm: systemd Not tainted 6.0.0-rc4-00066-gbed2685d9557 #1 1afcec0fe797aeed18cb95313bac4a75fb6852d3
> [ 31.995440][ T1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
> [ 31.996441][ T1] Call Trace:
> [ 31.996791][ T1] <TASK>
> [ 31.997101][ T1] dump_stack_lvl+0x6a/0x100
> [ 31.997590][ T1] __lock_acquire+0x1110/0x7480
> [ 31.998105][ T1] ? mark_lock+0x9a/0x380
> [ 31.998560][ T1] ? mark_held_locks+0xad/0x1c0
> [ 31.999056][ T1] ? lockdep_hardirqs_on_prepare+0x1a8/0x400
> [ 31.999650][ T1] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
> [ 32.000276][ T1] lock_acquire+0x177/0x480
> [ 32.000739][ T1] ? lockref_get+0xd/0x80
> [ 32.001178][ T1] _raw_spin_lock+0x2f/0x40
> [ 32.001656][ T1] ? lockref_get+0xd/0x80
> [ 32.002093][ T1] lockref_get+0xd/0x80
> [ 32.002529][ T1] __fsnotify_update_child_dentry_flags+0x142/0x2c0
> [ 32.003178][ T1] fsnotify_recalc_mask+0x126/0x1c0
> [ 32.003711][ T1] fsnotify_add_mark_locked+0xd9e/0x1280
> [ 32.004292][ T1] __x64_sys_inotify_add_watch+0x755/0xc00
> [ 32.004898][ T1] ? syscall_enter_from_user_mode+0x26/0x180
> [ 32.005660][ T1] do_syscall_64+0x6d/0xc0
> [ 32.006125][ T1] entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [ 32.006735][ T1] RIP: 0033:0x7f839dd0a8f7
> [ 32.007188][ T1] Code: f0 ff ff 73 01 c3 48 8b 0d 96 f5 0b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 fe 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 69 f5 0b 00 f7 d8 64 89 01 48
> [ 32.009103][ T1] RSP: 002b:00007ffe52095c98 EFLAGS: 00000202 ORIG_RAX: 00000000000000fe
> [ 32.009945][ T1] RAX: ffffffffffffffda RBX: 0000555bc72cf930 RCX: 00007f839dd0a8f7
> [ 32.010685][ T1] RDX: 0000000000000d84 RSI: 0000555bc72cf930 RDI: 000000000000001a
> [ 32.011469][ T1] RBP: 0000555bc72cf931 R08: 00000000fe000000 R09: 0000555bc72a1e90
> [ 32.012266][ T1] R10: 00007ffe52095c2c R11: 0000000000000202 R12: 0000000000000000
> [ 32.012976][ T1] R13: 0000555bc72a1e90 R14: 0000000000000d84 R15: 0000555bc72cf930
> [ 32.013705][ T1] </TASK>
>
>
> If you fix the issue, kindly add the following tags
> | Reported-by: kernel test robot <yujie.liu@xxxxxxxxx>
> | Link: https://lore.kernel.org/oe-lkp/202210271500.731e3808-yujie.liu@xxxxxxxxx
>
>
> To reproduce:
>
> # build kernel
> cd linux
> cp config-6.0.0-rc4-00066-gbed2685d9557 .config
> make HOSTCC=clang-14 CC=clang-14 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
> make HOSTCC=clang-14 CC=clang-14 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
> cd <mod-install-dir>
> find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
>
>
> git clone https://github.com/intel/lkp-tests.git
> cd lkp-tests
> bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
>
> # if you come across any failure that blocks the test,
> # please remove the ~/.lkp and /lkp directories to run from a clean state.
>
>