Re: [syzbot] WARNING: suspicious RCU usage in get_timespec64

From: Matthew Wilcox
Date: Sun Apr 04 2021 - 23:10:26 EST


On Sun, Apr 04, 2021 at 02:40:30PM -0700, Paul E. McKenney wrote:
> On Sun, Apr 04, 2021 at 10:38:41PM +0200, Thomas Gleixner wrote:
> > On Sun, Apr 04 2021 at 12:05, syzbot wrote:
> >
> > Cc + ...
>
> And a couple more...
>
> > > Hello,
> > >
> > > syzbot found the following issue on:
> > >
> > > HEAD commit: 5e46d1b7 reiserfs: update reiserfs_xattrs_initialized() co..
> > > git tree: upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1125f831d00000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=78ef1d159159890
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=88e4f02896967fe1ab0d
> > >
> > > Unfortunately, I don't have any reproducer for this issue yet.
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+88e4f02896967fe1ab0d@xxxxxxxxxxxxxxxxxxxxxxxxx
> > >
> > > =============================
> > > WARNING: suspicious RCU usage
> > > 5.12.0-rc5-syzkaller #0 Not tainted
> > > -----------------------------
> > > kernel/sched/core.c:8294 Illegal context switch in RCU-sched read-side critical section!
> > >
> > > other info that might help us debug this:
> > >
> > >
> > > rcu_scheduler_active = 2, debug_locks = 0
> > > 3 locks held by syz-executor.4/8418:
> > > #0:
> > > ffff8880751d2b28
> > > (
> > > &p->pi_lock
> > > ){-.-.}-{2:2}
> > > , at: try_to_wake_up+0x98/0x14a0 kernel/sched/core.c:3345
> > > #1:
> > > ffff8880b9d35258
> > > (
> > > &rq->lock
> > > ){-.-.}-{2:2}
> > > , at: rq_lock kernel/sched/sched.h:1321 [inline]
> > > , at: ttwu_queue kernel/sched/core.c:3184 [inline]
> > > , at: try_to_wake_up+0x5e6/0x14a0 kernel/sched/core.c:3464
> > > #2: ffff8880b9d1f948 (&per_cpu_ptr(group->pcpu, cpu)->seq){-.-.}-{0:0}, at: psi_task_change+0x142/0x220 kernel/sched/psi.c:807
>
> This looks similar to syzbot+dde0cc33951735441301@xxxxxxxxxxxxxxxxxxxxxxxxx
> in that rcu_sleep_check() sees an RCU lock held, but the later call to
> lockdep_print_held_locks() does not. Did something change recently that
> could let the ->lockdep_depth counter get out of sync with the actual
> number of locks held?

Dmitri had a different theory here:

https://groups.google.com/g/syzkaller-bugs/c/FmYvfZCZzqA/m/nc2CXUgsAgAJ