Re: [PATCH RFC] doc: rcu: remove obsolete (non-)requirement about disabling preemption

From: Paul E. McKenney
Date: Mon Oct 15 2018 - 07:21:18 EST


On Mon, Oct 15, 2018 at 09:05:22AM +0300, Nikolay Borisov wrote:
> On 15.10.2018 05:47, Joel Fernandes wrote:
> > On Sun, Oct 14, 2018 at 07:33:28PM -0700, Paul E. McKenney wrote:
> >> On Sun, Oct 14, 2018 at 07:13:49PM -0700, Joel Fernandes wrote:
> >>> On Sun, Oct 14, 2018 at 07:08:27PM -0700, Joel Fernandes wrote:
> >>>> On Sun, Oct 14, 2018 at 04:17:31PM -0700, Paul E. McKenney wrote:
> >>>>> On Sun, Oct 14, 2018 at 02:29:55PM -0700, Joel Fernandes (Google) wrote:
> >>>>>> The Requirements.html document says "Disabling Preemption Does Not Block
> >>>>>> Grace Periods". However this is no longer true with the RCU
> >>>>>> consolidation. Lets remove the obsolete (non-)requirement entirely.
> >>>>>>
> >>>>>> Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
> >>>>>
> >>>>> Good catch, queued, thank you!
> >>>>
> >>>> Thanks! By the way after I sent the patch, I also tried Oleg's experiment to
> >>>> confirm that this is indeed obsolete. :)
> >>>>
> >>>> One thing interesting came up when I tried synchronize_rcu_expedited()
> >>>> instead of synchronize_rcu() in Oleg's experiment, I still saw a multiple
> >>>> millisecond delay between when the rcu read section completely and the
> >>>> synchronize_rcu_expedited returns:
> >>>>
> >>>> For example, with synchronize_rcu_expedited, the 'SPIN done' and the 'SYNC
> >>>> done' are about 3 millisecond apart:
> >>>> [ 77.599142] SPIN start
> >>>> [ 77.601595] SYNC start
> >>>> [ 82.604950] SPIN done!
> >>>> [ 82.607836] SYNC done!
> >>>> I saw anywhere from 2-6 milliseconds.
> >>>>
> >>>> The reason I bring this up is according to Requirements.html: In some cases,
> >>>> the multi-millisecond synchronize_rcu() latencies are unacceptable. In these
> >>>> cases, synchronize_rcu_expedited() may be used instead,.. so either I messed
> >>>> something up in the experiment, or I need to update this part of the document ;-)
> >>
> >> In normal testing, 2-6 milliseconds is indeed excessive. Could you please
> >> point me at Oleg's experiment? Also, what CONFIG_PREEMPT setting were
> >> you using? (My guess is CONFIG_PREEMPT=y.)
> >
> > The CONFIG_PREEMPT config I am using is CONFIG_PREEMPT=y.
> >
> >>> So I realized I'm running in Qemu so it could also be a scheduling delay of
> >>> the vcpu thread. So apologies about the noise if the experiment works fine
> >>> for you.
> >>
> >> I used rcuperf, which might not be doing the same thing as Oleg's
> >> experiment.
> >
> > The experiment is mentioned at:
> > https://www.mail-archive.com/linux-kernel@xxxxxxxxxxxxxxx/msg912055.html
> >
> > If you apply the below diff, it applies cleanly on rcu/dev. And then run:
> > taskset 2 perl -e 'syscall 157, 666, 5000' &
> > taskset 1 perl -e 'syscall 157, 777'
> >
> > diff --git a/kernel/sys.c b/kernel/sys.c
> > index cf5c67533ff1..b654b7566ca3 100644
> > --- a/kernel/sys.c
> > +++ b/kernel/sys.c
> > @@ -2261,6 +2261,9 @@ int __weak arch_prctl_spec_ctrl_set(struct task_struct *t, unsigned long which,
> > return -EINVAL;
> > }
> >
> > +#include <linux/delay.h>
> > +
> > +
> > SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
> > unsigned long, arg4, unsigned long, arg5)
> > {
> > @@ -2274,6 +2277,19 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
> >
> > error = 0;
> > switch (option) {
> > + case 666:
> > + preempt_disable();
> > + pr_crit("SPIN start\n");
> > + while (arg2--)
> > + mdelay(1);

OK, this is the problem. When you spin in the kernel for several
milliseconds with preemption disabled, the consolidated grace period
is -required- to wait for this preemption-disabled reader to complete,
whether expedited or not.

So, expected behavior. ;-)

In any case, please don't spin for milliseconds with preemption disabled.
The real-time guys are unlikely to be happy with you if you do this!

> > + pr_crit("SPIN done!\n");
> > + preempt_enable();
> > + break;
> > + case 777:
> > + pr_crit("SYNC start\n");
> > + synchronize_rcu();
> > + pr_crit("SYNC done!\n");
>
> But you are using the console printing infrastructure which is rather
> heavyweight. Try replacing pr_* calls with trace_printk so that you
> write to the lock-free ring buffer, this will reduce the noise from the
> heavy console printing infrastructure.

And this might be a problem as well.

Thanx, Paul

> > + break;
> > case PR_SET_PDEATHSIG:
> > if (!valid_signal(arg2)) {
> > error = -EINVAL;
> >
>