Re: [PATCH RFC v1 2/2] rcuperf: Add kfree_rcu performance Tests

From: Joel Fernandes
Date: Fri Aug 09 2019 - 12:01:59 EST


On Wed, Aug 07, 2019 at 10:56:57AM -0700, Paul E. McKenney wrote:
> On Wed, Aug 07, 2019 at 06:22:13AM -0400, Joel Fernandes wrote:
> > On Tue, Aug 06, 2019 at 05:29:15PM -0700, Paul E. McKenney wrote:
> > > On Tue, Aug 06, 2019 at 05:20:41PM -0400, Joel Fernandes (Google) wrote:
> > > > This test runs kfree_rcu in a loop to measure performance of the new
> > > > kfree_rcu, with and without patch.
> > > >
> > > > To see improvement, run with boot parameters:
> > > > rcuperf.kfree_loops=2000 rcuperf.kfree_alloc_num=100 rcuperf.perf_type=kfree
> > > >
> > > > Without patch, test runs in 6.9 seconds.
> > > > With patch, test runs in 6.1 seconds (+13% improvement)
> > > >
> > > > If it is desired to run the test but with the traditional (non-batched)
> > > > kfree_rcu, for example to compare results, then you could pass along the
> > > > rcuperf.kfree_no_batch=1 boot parameter.
> > >
> > > You lost me on this one. You ran two runs, with rcuperf.kfree_no_batch=1
> > > and without? Or you ran this patch both with and without the earlier
> > > patch, and could have run with the patch and rcuperf.kfree_no_batch=1?
> >
> > I always run the rcutorture test with patch because the patch doesn't really
> > do anything if rcuperf.kfree_no_batch=0. This parameter is added so that in
> > the future folks can compare effect of non-batching with that of the
> > batching. However, I can also remove the patch itself and run this test
> > again.
> >
> > > If the latter, it would be good to try all three.
> >
> > Ok, sure.
>
> Very good! And please make the commit log more clear. ;-)

Sure, will do :)

> > > > + long me = (long)arg;
> > > > + struct kfree_obj **alloc_ptrs;
> > > > + u64 start_time, end_time;
> > > > +
> > > > + VERBOSE_PERFOUT_STRING("kfree_perf_thread task started");
> > > > + set_cpus_allowed_ptr(current, cpumask_of(me % nr_cpu_ids));
> > > > + set_user_nice(current, MAX_NICE);
> > > > + atomic_inc(&n_kfree_perf_thread_started);
> > > > +
> > > > + alloc_ptrs = (struct kfree_obj **)kmalloc(sizeof(struct kfree_obj *) * kfree_alloc_num,
> > > > + GFP_KERNEL);
> > > > + if (!alloc_ptrs)
> > > > + return -ENOMEM;
> > > > +
> > > > + start_time = ktime_get_mono_fast_ns();
> > >
> > > Don't you want to announce that you started here rather than above in
> > > order to avoid (admittedly slight) measurement inaccuracies?
> >
> > I did not follow, are you referring to the measurement inaccuracy related to
> > the "kfree_perf_thread task started" string print? Or, are you saying that
> > ktime_get_mono_fast_ns() has to start earlier than over here?
>
> I am referring to the atomic_inc().

Oh yes, great catch. I will move the atomic_inc() closer to the test's
actual start. Thanks!
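
For illustration, a userspace analogue of that reordering might look like the
sketch below. This is not the kernel change itself: perf_worker(), now_ns()
and n_started are hypothetical stand-ins (for kfree_perf_thread(),
ktime_get_mono_fast_ns() and n_kfree_perf_thread_started), and malloc()/free()
stand in for the allocate/kfree_rcu() loop. The only point is that the
"started" increment sits right next to the start timestamp, after the setup
allocation.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for n_kfree_perf_thread_started. */
static atomic_int n_started;

/* Hypothetical stand-in for ktime_get_mono_fast_ns(). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Returns 0 on success and stores the measured duration in *elapsed_ns. */
static int perf_worker(size_t alloc_num, int loops, uint64_t *elapsed_ns)
{
	void **ptrs;
	uint64_t start;

	/* Setup work happens before the thread announces itself. */
	ptrs = malloc(sizeof(*ptrs) * alloc_num);
	if (!ptrs)
		return -ENOMEM;

	/* Announce the start immediately before taking the timestamp. */
	atomic_fetch_add(&n_started, 1);
	start = now_ns();

	for (int l = 0; l < loops; l++) {
		for (size_t i = 0; i < alloc_num; i++)
			ptrs[i] = malloc(16);
		for (size_t i = 0; i < alloc_num; i++)
			free(ptrs[i]);	/* free(NULL) is a no-op */
	}

	*elapsed_ns = now_ns() - start;
	free(ptrs);
	return 0;
}
```

With this ordering, anything waiting on the started count does not see the
thread as running while it is still doing its setup allocation.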

> > (I will reply to the rest of the comments below in a bit, I am going to a
> > hospital now to visit a sick relative and will be back a bit later.)
>
> Ouch!!! I hope that goes as well as it possibly can! And please don't
> neglect your relative on RCU's account!!!

Thanks! It went quite well and now I am back to work ;-)

- Joel