Re: Finally starting on short RCU grace periods, but...

From: Dmitry Vyukov
Date: Thu Aug 06 2020 - 13:05:39 EST


On Thu, Aug 6, 2020 at 3:22 PM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>
> On Thu, Aug 6, 2020 at 12:31 PM Marco Elver <elver@xxxxxxxxxx> wrote:
> >
> > +Cc kasan-dev
> >
> > On Thu, 6 Aug 2020 at 01:08, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> > >
> > > Hello!
> > >
> > > If I remember correctly, one of you asked for a way to shorten RCU
> > > grace periods so that KASAN would have a better chance of detecting bugs
> > > such as pointers being leaked out of RCU read-side critical sections.
> > > I am finally starting to enter and test code for this, but realized
> > > that I had forgotten a couple of things:
> > >
> > > 1. I don't remember exactly who asked, but I suspect that it was
> > > Kostya. I am using his Reported-by as a placeholder for the
> > > moment, but please let me know if this should be adjusted.
> >
> > It certainly was not me.
> >
> > > 2. Although this work is necessary to detect situations where
> > > call_rcu() is used to initiate a grace period, there already
> > > exists a way to make short grace periods that are initiated by
> > > synchronize_rcu(), namely, the rcupdate.rcu_expedited kernel
> > > boot parameter. This will cause all calls to synchronize_rcu()
> > > to act like synchronize_rcu_expedited(), resulting in about 2-3
> > > orders of magnitude reduction in grace-period latency on small
> > > systems (say 16 CPUs).
> > >
> > > In addition, I plan to make a few other adjustments that will
> > > increase the probability of KASAN spotting a pointer leak even in the
> > > rcupdate.rcu_expedited case.
> >
> > Thank you, that'll be useful I think.
> >
> > > But if you would like to start this sort of testing on current mainline,
> > > rcupdate.rcu_expedited is your friend!
>
> Hi Paul,
>
> This is great!
>
> I understand it's not a sufficiently challenging way of tracking
> things, but it's simply here ;)
> https://bugzilla.kernel.org/show_bug.cgi?id=208299
> (now we also know who asked for this, +Jann)
>
> I've tested on the latest mainline and with rcupdate.rcu_expedited=1
> it boots to ssh successfully and I see:
> [ 0.369258][ T0] All grace periods are expedited (rcu_expedited).
>
> I have created https://github.com/google/syzkaller/pull/2021 to enable
> it on syzbot.
> On syzbot we generally use only 2-4 CPUs per VM, so it should be even better.
>
> > Do any of you remember some bugs we missed due to this? Can we find
> > them if we add this option?
>
> The problem is that it's hard to remember bugs that were not caught :)
> Here is an approximation of UAFs with free in rcu callback:
> https://groups.google.com/forum/#!searchin/syzkaller-bugs/KASAN$20use-after-free$20rcu_do_batch%7Csort:date
> The ones with low hit count are the ones that we almost did not catch.
> That's the best estimate I can think of. Also, we could potentially get
> reproducers for such bugs that currently lack them.
> Maybe we will be able to correlate some bugs/reproducers that appear
> soon with this change.

Wait, it was added in 2012?
https://github.com/torvalds/linux/commit/3705b88db0d7cc4
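
For anyone who wants to repeat the boot check quoted above on their own VM, here is a minimal sketch in Python. It only looks for two things already mentioned in this thread: the rcupdate.rcu_expedited boot parameter on the kernel command line, and the "All grace periods are expedited" banner in the boot log. The function names are made up for illustration, and the command-line parsing is a simplification of what the kernel actually does (for example, it does not handle quoting or octal values).

```python
# Banner text copied from the boot log quoted earlier in this thread.
BANNER = "All grace periods are expedited (rcu_expedited)."

# Boot parameter name as given in the thread.
PARAM = "rcupdate.rcu_expedited"


def cmdline_requests_expedited(cmdline: str) -> bool:
    """Return True if the command line sets rcupdate.rcu_expedited non-zero.

    Hypothetical helper: a simplified parser that splits on whitespace.
    If the parameter appears more than once, the last occurrence is used.
    """
    requested = False
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key != PARAM or not sep:
            continue
        try:
            requested = int(value, 0) != 0
        except ValueError:
            # Values this sketch cannot parse are treated as "not set".
            requested = False
    return requested


def log_confirms_expedited(log_text: str) -> bool:
    """Return True if any boot-log line contains the expedited banner."""
    return any(BANNER in line for line in log_text.splitlines())
```

To use it, pass the contents of /proc/cmdline to cmdline_requests_expedited() and the saved dmesg output to log_confirms_expedited(). If the first returns True and the second returns False, the parameter probably did not take effect.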