Re: WARNING: suspicious RCU usage in idtentry_exit

From: Paul E. McKenney
Date: Fri May 29 2020 - 12:07:10 EST


On Fri, May 29, 2020 at 04:32:31PM +0200, Dmitry Vyukov wrote:
> On Fri, May 29, 2020 at 4:05 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> >
> > On Fri, May 29, 2020 at 08:20:12AM +0200, Dmitry Vyukov wrote:
> > > On Thu, May 28, 2020 at 10:48 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> > > >
> > > > On Thu, May 28, 2020 at 10:19:02PM +0200, Thomas Gleixner wrote:
> > > > > Paul,
> > > > >
> > > > > "Paul E. McKenney" <paulmck@xxxxxxxxxx> writes:
> > > > > > On Thu, May 28, 2020 at 03:33:44PM +0200, Thomas Gleixner wrote:
> > > > > >> syzbot <syzbot+3ae5eaae0809ee311e75@xxxxxxxxxxxxxxxxxxxxxxxxx> writes:
> > > > > >> Weird. I have no idea how that thing is an EQS here.
> > > > > >
> > > > > > No argument on the "Weird" part! ;-)
> > > > > >
> > > > > > Is this a NO_HZ_FULL=y kernel?
> > > > >
> > > > > No, it has only NO_HZ_IDLE.
> > > > >
> > > > > https://syzkaller.appspot.com/x/.config?x=47b0740d89299c10
> > > >
> > > > OK, from the .config, another suggestion is to build the kernel
> > > > with CONFIG_RCU_EQS_DEBUG=y. This still requires that this issue be
> > > > reproduced, but it might catch the problem earlier.
> > >
> > > How much does it slow down execution? If we enable it on syzbot, it
> > > will affect all fuzzing done by syzbot always.
> > > It can tolerate significant slowdown and it's far from a production
> > > kernel (it enables KASAN, KCOV, LOCKDEP and more). But I am still
> > > asking because some debugging features are built without performance
> > > in mind at all (like let's just drop a global lock in every
> > > kmalloc/free, which may be too much even for a standard debug build).
> >
> > It is an extra WARN_ON_ONCE() with a simple comparison, but on almost
> > every kernel entry/exit path.
> >
> > So not something you want in production, but much lighter weight than
> > any of the tools you listed above.
> >
> > Full disclosure: It usually fires for new architectures or for new
> > timer hardware/drivers. Which might allow you to enable it selectively.
>
>
> This sounds reasonable. I've enabled it:
> https://github.com/google/syzkaller/commit/3905eaae004605f4ec4dab83e6883173796118c8
> syzbot will pick up within a day or so. Then crashes will have any
> additional checks captured.
>
> The arch/hardware is quite old: x86_64/GCE. It also booted for me in
> qemu without warnings.

Very good, thank you!

Thanx, Paul

> > > > > > If so, one possibility is that the call
> > > > > > to rcu_user_exit() went missing somehow. If not, then RCU should have
> > > > > > been watching userspace execution.
> > > > > >
> > > > > > Again, the only thing I can think of (should this prove to be
> > > > > > reproducible) is the rcu_dyntick trace event.
> > > > >
> > > > > :)
> > > > >
> > > > > Thanks,
> > > > >
> > > > > tglx
> > > >
> > > > Thanx, Paul
> > > >
> > > > --
> > > > You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> > > > To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@xxxxxxxxxxxxxxxxx
> > > > To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/20200528204839.GR2869%40paulmck-ThinkPad-P72.
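
For readers following the thread, the cost described above ("an extra WARN_ON_ONCE() with a simple comparison, but on almost every kernel entry/exit path") can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel's CONFIG_RCU_EQS_DEBUG code: every name below (warn_on_once_model, in_eqs, model_kernel_entry, model_kernel_exit) is hypothetical and chosen for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */

static int warnings_printed;	/* how many warnings were actually emitted */

/* State for one warn-once site: the warning fires at most one time. */
struct warn_once_site {
	bool fired;
};

/*
 * Model of a WARN_ON_ONCE()-style check: returns cond, and prints a
 * warning only the first time cond is true for this site.
 */
static bool warn_on_once_model(struct warn_once_site *site, bool cond,
			       const char *msg)
{
	if (cond && !site->fired) {
		site->fired = true;
		warnings_printed++;
		fprintf(stderr, "WARNING: %s\n", msg);
	}
	return cond;
}

/* Assumed state: true while the modelled CPU is in an extended quiescent state. */
static bool in_eqs = true;
static struct warn_once_site entry_site, exit_site;

/* Entry path: one comparison, then leave the quiescent state. */
static void model_kernel_entry(void)
{
	warn_on_once_model(&entry_site, !in_eqs, "entry while already watching");
	in_eqs = false;
}

/* Exit path: one comparison, then re-enter the quiescent state. */
static void model_kernel_exit(void)
{
	warn_on_once_model(&exit_site, in_eqs, "exit while not watching");
	in_eqs = true;
}
```

Balanced model_kernel_entry()/model_kernel_exit() pairs stay silent. An unbalanced call trips the comparison and prints once per site, and later repeats only pay for the comparison itself, which is why such a check is cheap next to tools like KASAN or LOCKDEP.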