Re: [PATCH 1/5] rcu/rcuperf: Add kfree_rcu() performance Tests

From: Joel Fernandes
Date: Tue Sep 03 2019 - 16:21:25 EST


On Tue, Sep 03, 2019 at 01:08:49PM -0700, Paul E. McKenney wrote:
> On Thu, Aug 29, 2019 at 04:56:37PM -0400, Joel Fernandes wrote:
> > On Wed, Aug 28, 2019 at 02:12:26PM -0700, Paul E. McKenney wrote:
>
> [ . . . ]
>
> > > > +static int
> > > > +kfree_perf_thread(void *arg)
> > > > +{
> > > > + int i, loop = 0;
> > > > + long me = (long)arg;
> > > > + struct kfree_obj *alloc_ptr;
> > > > + u64 start_time, end_time;
> > > > +
> > > > + VERBOSE_PERFOUT_STRING("kfree_perf_thread task started");
> > > > + set_cpus_allowed_ptr(current, cpumask_of(me % nr_cpu_ids));
> > > > + set_user_nice(current, MAX_NICE);
> > > > +
> > > > + start_time = ktime_get_mono_fast_ns();
> > > > +
> > > > + if (atomic_inc_return(&n_kfree_perf_thread_started) >= kfree_nrealthreads) {
> > > > + if (gp_exp)
> > > > + b_rcu_gp_test_started = cur_ops->exp_completed() / 2;
> > >
> > > At some point, it would be good to use the new grace-period
> > > sequence-counter functions (rcuperf_seq_diff(), for example) instead of
> > > the open-coded division by 2. I freely admit that you are just copying
> > > my obsolete hack in this case, so not needed in this patch.
> >
> > But I am using rcu_seq_diff() below in the pr_alert().
> >
> > Anyway, I agree this can be a follow-on since this pattern is borrowed from
> > another part of rcuperf. However, I am also confused about the pattern
> > itself.
> >
> > If I understand, you are doing the "/ 2" because expedited_sequence
> > progresses by 2 for every expedited batch.
> >
> > But does rcu_seq_diff() really work on these expedited GP numbers, and will
> > it be immune to changes in RCU_SEQ_STATE_MASK? Sorry for the silly questions,
> > but admittedly I have not looked too much yet into expedited RCU so I could
> > be missing the point.
>
> Yes, expedited grace periods use the common sequence-number functions.
> Oddly enough, normal grace periods were the last to make use of these.

OK, I will clean this up in a follow-on patch as we agreed, so as not to block this series.
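
To make the concern about the open-coded "/ 2" concrete, here is a small
userspace model of a sequence counter whose low-order bits hold state. It is
only a sketch: the sketch_* names are hypothetical, the two-bit state width is
an assumption, and the batch helper is a simplification that only handles idle
snapshots (it is not the kernel's rcu_seq_diff()).

```c
#include <assert.h>

/* Assumed layout for this sketch: two low-order state bits. */
#define SKETCH_SEQ_CTR_SHIFT	2
#define SKETCH_SEQ_STATE_MASK	((1UL << SKETCH_SEQ_CTR_SHIFT) - 1)

/* Mark a grace period as started by setting a state bit. */
static inline void sketch_seq_start(unsigned long *sp)
{
	*sp += 1;
}

/* Mark it as ended: clear the state bits and advance the counter. */
static inline void sketch_seq_end(unsigned long *sp)
{
	*sp = (*sp | SKETCH_SEQ_STATE_MASK) + 1;
}

/* Completed grace periods between two idle snapshots (simplified). */
static inline unsigned long sketch_seq_batches(unsigned long new,
					       unsigned long old)
{
	return (new >> SKETCH_SEQ_CTR_SHIFT) - (old >> SKETCH_SEQ_CTR_SHIFT);
}

/* Run n back-to-back grace periods and return the final raw value. */
static inline unsigned long sketch_run(unsigned long seq, int n)
{
	while (n-- > 0) {
		sketch_seq_start(&seq);
		sketch_seq_end(&seq);
	}
	return seq;
}
```

Starting from 0 and running three grace periods leaves the raw value at 12.
The shift-based helper reports 3 batches, while dividing the raw value by 2
gives 6, so the open-coded division is only right for one particular state
width.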

> > > > + else
> > > > + b_rcu_gp_test_finished = cur_ops->get_gp_seq();
> > > > +
> > > > + pr_alert("Total time taken by all kfree'ers: %llu ns, loops: %d, batches: %ld\n",
> > > > + (unsigned long long)(end_time - start_time), kfree_loops,
> > > > + rcuperf_seq_diff(b_rcu_gp_test_finished, b_rcu_gp_test_started));
> > > > + if (shutdown) {
> > > > + smp_mb(); /* Assign before wake. */
> > > > + wake_up(&shutdown_wq);
> > > > + }
> > > > + }
> > > > +
> > > > + torture_kthread_stopping("kfree_perf_thread");
> > > > + return 0;
> > > > +}
> > > > +
> > > > +static void
> > > > +kfree_perf_cleanup(void)
> > > > +{
> > > > + int i;
> > > > +
> > > > + if (torture_cleanup_begin())
> > > > + return;
> > > > +
> > > > + if (kfree_reader_tasks) {
> > > > + for (i = 0; i < kfree_nrealthreads; i++)
> > > > + torture_stop_kthread(kfree_perf_thread,
> > > > + kfree_reader_tasks[i]);
> > > > + kfree(kfree_reader_tasks);
> > > > + }
> > > > +
> > > > + torture_cleanup_end();
> > > > +}
> > > > +
> > > > +/*
> > > > + * shutdown kthread. Just waits to be awakened, then shuts down system.
> > > > + */
> > > > +static int
> > > > +kfree_perf_shutdown(void *arg)
> > > > +{
> > > > + do {
> > > > + wait_event(shutdown_wq,
> > > > + atomic_read(&n_kfree_perf_thread_ended) >=
> > > > + kfree_nrealthreads);
> > > > + } while (atomic_read(&n_kfree_perf_thread_ended) < kfree_nrealthreads);
> > > > +
> > > > + smp_mb(); /* Wake before output. */
> > > > +
> > > > + kfree_perf_cleanup();
> > > > + kernel_power_off();
> > > > + return -EINVAL;
> > >
> > > These last four lines should be combined with those of
> > > rcu_perf_shutdown(). Actually, you could fold the two functions together
> > > with only a pair of arguments and two one-line wrapper functions, which
> > > would be even better.
> >
> > But the cleanup() function is different in the 2 cases and will have to be
> > passed in as a function pointer. I believe we discussed this last review as
> > well.
>
> Calling through a pointer should be a non-problem in this case. We are
> nowhere near a fastpath.

There's also the wait_event() condition, which is different in the two cases.
I don't see how we can combine them cleanly. The result will probably not look
any better, and will likely need about the same number of lines of code. Can
this function stay as is? Pretty please :-D. Or perhaps, if you feel it is a
trivial cleanup, could you do it so I can understand what you mean?
Sorry!
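
For the sake of discussion, a rough userspace sketch of what such a fold might
look like, assuming the two wait conditions differ only in which counter and
which thread count they compare. Everything here is hypothetical: the sketch_*
names, the three-argument common helper, the polling loop standing in for
wait_event(), and the fence standing in for smp_mb(). kernel_power_off() is
left out entirely.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Stand-ins for the per-test counters and thread counts. */
static atomic_int sketch_n_writer_finished;
static atomic_int sketch_n_kfree_ended;
static int sketch_nrealwriters = 2;
static int sketch_kfree_nrealthreads = 3;

/* Records which cleanup ran: [0] for rcu_perf, [1] for kfree_perf. */
static int sketch_cleanup_calls[2];

static inline void sketch_rcu_perf_cleanup(void)
{
	sketch_cleanup_calls[0]++;
}

static inline void sketch_kfree_perf_cleanup(void)
{
	sketch_cleanup_calls[1]++;
}

/* Common body: wait for the counter, order, then run the given cleanup. */
static inline int sketch_perf_shutdown_common(atomic_int *n_done, int n_needed,
					      void (*cleanup)(void))
{
	while (atomic_load(n_done) < n_needed)
		sched_yield();	/* Polling stands in for wait_event(). */

	atomic_thread_fence(memory_order_seq_cst); /* Stands in for smp_mb(). */

	cleanup();
	/* The kernel version would power off here. */
	return -1;	/* Placeholder for -EINVAL. */
}

/* One-line wrappers keep the kthread function signature. */
static inline int sketch_rcu_perf_shutdown(void *arg)
{
	(void)arg;
	return sketch_perf_shutdown_common(&sketch_n_writer_finished,
					   sketch_nrealwriters,
					   sketch_rcu_perf_cleanup);
}

static inline int sketch_kfree_perf_shutdown(void *arg)
{
	(void)arg;
	return sketch_perf_shutdown_common(&sketch_n_kfree_ended,
					   sketch_kfree_nrealthreads,
					   sketch_kfree_perf_cleanup);
}
```

In this model the cleanup call goes through a function pointer, which matches
the point that this path is nowhere near a fastpath. Whether this is the shape
you had in mind is a guess on my part.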

thanks!

- Joel