Re: WARNING: bad unlock balance in ovl_llseek

From: Amir Goldstein
Date: Mon Mar 02 2020 - 06:11:01 EST


On Sun, Mar 1, 2020 at 9:13 PM syzbot
<syzbot+66a9752fa927f745385e@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit: f8788d86 Linux 5.6-rc3
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=13c5f8f9e00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=5d2e033af114153f
> dashboard link: https://syzkaller.appspot.com/bug?extid=66a9752fa927f745385e
> compiler: clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=131d9a81e00000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=14117a81e00000
>

Dmitry,

There is something strange about the C repro.
It passes an invalid address for the first arg of mount syscall:

syscall(__NR_mount, 0x400000ul, 0x20000000ul, 0x20000080ul, 0ul,
0x20000100ul);

With this address mount syscall returns -EFAULT on my system.
I fixed this manually, but repro did not trigger the reported bug on my system.

> The bug was bisected to:
>
> commit b1f9d3858f724ed45b279b689fb5b400d91352e3
> Author: Amir Goldstein <amir73il@xxxxxxxxx>
> Date: Sat Dec 21 09:42:29 2019 +0000
>
> ovl: use ovl_inode_lock in ovl_llseek()
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=16ff3bede00000
> final crash: https://syzkaller.appspot.com/x/report.txt?x=15ff3bede00000
> console output: https://syzkaller.appspot.com/x/log.txt?x=11ff3bede00000
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+66a9752fa927f745385e@xxxxxxxxxxxxxxxxxxxxxxxxx
> Fixes: b1f9d3858f72 ("ovl: use ovl_inode_lock in ovl_llseek()")
>
> =====================================
> WARNING: bad unlock balance detected!
> 5.6.0-rc3-syzkaller #0 Not tainted
> -------------------------------------
> syz-executor194/8947 is trying to release lock (&ovl_i_lock_key[depth]) at:
> [<ffffffff828b7835>] ovl_inode_unlock fs/overlayfs/overlayfs.h:328 [inline]
> [<ffffffff828b7835>] ovl_llseek+0x215/0x2c0 fs/overlayfs/file.c:193
> but there are no more locks to release!
>

This is strange. I don't see how that can happen nor how my change would
have caused this regression. If anything, the lock chance may have brought
a bug in stack file ops to light, but don't see the bug.

The repro is multi-threaded but when I ran the repro, a single thread did:
- open lower file (pre copy up)
- lchown file (copy up)
- llseek the open file (so llseek is on a temporary ovl_open_realfile())

Perhaps when bug was triggered ops above were executed by different
threads?

Dmitry, I may have asked this before - how hard would it be to attach an
strace of the repro to a bug report?

Thanks,
Amir.