[RT WARNING] DEBUG_LOCKS_WARN_ON(rt_mutex_owner(lock) != current) with fsfreeze (4.19.25-rt16)
From: Juri Lelli
Date: Tue Mar 26 2019 - 05:34:27 EST
Hi,
Running this reproducer on a 4.19.25-rt16 kernel (with lock debugging
turned on) produces the warning below.
--->8---
# dd if=/dev/zero of=fsfreezetest count=999999
# mkfs -t xfs -q ./fsfreezetest
# mkdir testmount
# mount -t xfs -o loop ./fsfreezetest ./testmount
# for I in `seq 10`; do fsfreeze -f ./testmount; sleep 1; fsfreeze -u ./testmount; done
--->8---
------------[ cut here ]------------
DEBUG_LOCKS_WARN_ON(rt_mutex_owner(lock) != current)
WARNING: CPU: 10 PID: 1226 at kernel/locking/rtmutex-debug.c:145 debug_rt_mutex_unlock+0x9b/0xb0
Modules linked in: xfs [...]
CPU: 10 PID: 1226 Comm: fsfreeze Not tainted 4.19.25-rt16 #2
Hardware name: LENOVO 30B6S2F900/1030, BIOS S01KT61A 09/28/2018
RIP: 0010:debug_rt_mutex_unlock+0x9b/0xb0
Code: e8 aa af 3c 00 4c 8b 04 24 85 c0 74 a9 8b 05 3c 9c a6 02 85 c0 75 9f 48 c7 c6 b8 b4 2d 98 48 c7 c7 9b d2 2b 98 e8 d9 e5 f8 ff <0f> 0b 4c 8b 04 24 eb 84 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 c3
RSP: 0018:ffffa7efa60cbdd0 EFLAGS: 00010086
RAX: 0000000000000000 RBX: ffff991b72813920 RCX: 0000000000000000
RDX: 0000000000000007 RSI: ffffffff98318de2 RDI: 00000000ffffffff
RBP: 0000000000000246 R08: 0000000000000000 R09: 0000000000024200
R10: 0000000000000000 R11: 0000000000000000 R12: ffffa7efa60cbe08
R13: ffffa7efa60cbe18 R14: ffff991b72813478 R15: ffffffff9730718d
FS: 00007f19baf6a540(0000) GS:ffff991b9fb00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f19bae87040 CR3: 000000103c6ee002 CR4: 00000000001606e0
Call Trace:
rt_mutex_slowunlock+0x24/0x70
__rt_mutex_unlock+0x45/0x80
percpu_up_write+0x4b/0x60
thaw_super_locked+0xdb/0x110
do_vfs_ioctl+0x647/0x6f0
ksys_ioctl+0x60/0x90
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x60/0x1f0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f19bae9704b
Code: 0f 1e fa 48 8b 05 3d be 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0d be 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffc6d275358 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f19bae9704b
RDX: 0000000000000000 RSI: 00000000c0045878 RDI: 0000000000000003
RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc6d275755
R13: 00007ffc6d275500 R14: 0000000000000000 R15: 0000000000000000
irq event stamp: 8002
hardirqs last enabled at (8001): [<ffffffff97a25981>] _raw_spin_unlock_irqrestore+0x81/0x90
hardirqs last disabled at (8002): [<ffffffff97a25aa0>] _raw_spin_lock_irqsave+0x20/0x60
softirqs last enabled at (0): [<ffffffff970c04ad>] copy_process.part.36+0x89d/0x2170
softirqs last disabled at (0): [<0000000000000000>] (null)
---[ end trace 0000000000000002 ]---
AFAIU, this is a legit warning, since:
 - fsfreeze -f ./testmount grabs the rt_mutexes embedded into
   sb->s_writers.rw_sem[SB_FREEZE_LEVELS] (rt-rwsem) as part of executing
   sb_wait_write() (for each FREEZE_LEVEL) in freeze_super().
 - We then return to userspace.
 - fsfreeze -u ./testmount unlocks the rt_mutexes while doing
   sb_freeze_unlock() in thaw_super_locked(). This is a different process
   from the one that did the freeze above.
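To illustrate the pattern outside the kernel, here is a minimal userspace
sketch, assuming a POSIX error-checking mutex is an acceptable stand-in for
an rt_mutex with lock debugging on. It is only an analogy: the function names
are made up for this example, and threads stand in for the two fsfreeze
processes. One thread takes the lock, a different thread attempts the unlock,
and the owner check rejects it.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Helper thread body: try to unlock a mutex this thread never locked. */
static void *unlock_as_non_owner(void *arg)
{
    pthread_mutex_t *m = arg;

    /* Pass the error code back through the thread's exit value. */
    return (void *)(intptr_t)pthread_mutex_unlock(m);
}

/*
 * Lock an error-checking mutex in the calling thread, then let a second
 * thread attempt the unlock. Returns the helper's pthread_mutex_unlock()
 * result, or -1 if the setup itself failed.
 */
int unlock_from_other_thread(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    pthread_t helper;
    void *res = NULL;
    int ret = -1;

    if (pthread_mutexattr_init(&attr))
        return -1;
    if (pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK) ||
        pthread_mutex_init(&m, &attr)) {
        pthread_mutexattr_destroy(&attr);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);

    /* "freeze": the calling thread becomes the owner. */
    pthread_mutex_lock(&m);

    /* "thaw": attempted by a thread that is not the owner. */
    if (!pthread_create(&helper, NULL, unlock_as_non_owner, &m) &&
        !pthread_join(helper, &res))
        ret = (int)(intptr_t)res;

    /* The real owner can still release the lock. */
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return ret;
}
```

Calling unlock_from_other_thread() from a small main() (built with
cc -pthread) should give EPERM, which is the userspace counterpart of the
rt_mutex_owner(lock) != current check firing above.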
I noticed that a very similar problem was fixed (for !rt rwsem) by
5a817641f68a ("locking/percpu-rwsem: Annotate rwsem ownership transfer
by setting RWSEM_OWNER_UNKNOWN"). However, RT of course has to deal with
PI (priority inheritance), so I wonder if there is an easy fix for this problem.
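As a rough sketch of the idea suggested by that commit's title (not the
kernel code itself), the toy lock below tracks an owner id and lets the
owner field be set to an "unknown" sentinel so that a later release by
another task passes the debug check. struct toy_lock, TOY_OWNER_UNKNOWN and
the toy_lock_*() helpers are hypothetical names invented for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sentinel, loosely modelled on the idea of RWSEM_OWNER_UNKNOWN. */
#define TOY_OWNER_UNKNOWN (-1)

struct toy_lock {
    bool held;
    int owner;  /* id of the task holding the lock, or TOY_OWNER_UNKNOWN */
};

/* Take the lock and record the acquiring task as owner. */
void toy_lock_acquire(struct toy_lock *l, int task)
{
    l->held = true;
    l->owner = task;
}

/* Annotate that the releasing task need not be the one that acquired. */
void toy_lock_disown(struct toy_lock *l)
{
    l->owner = TOY_OWNER_UNKNOWN;
}

/*
 * Release the lock on behalf of 'task'. Returns false where a debug
 * check of the owner would complain: the lock is not held, or it has a
 * known owner that differs from 'task'. The toy refuses the release in
 * that case instead of printing a warning.
 */
bool toy_lock_release(struct toy_lock *l, int task)
{
    if (!l->held)
        return false;
    if (l->owner != TOY_OWNER_UNKNOWN && l->owner != task)
        return false;
    l->held = false;
    return true;
}
```

With this toy, a release by task 2 of a lock acquired by task 1 is rejected
unless toy_lock_disown() ran in between. The catch for RT is visible even
here: once the owner is unknown, there is no task left that PI could boost.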
Suggestions?
Thanks,
- Juri