Re: [PATCH 1/5] rcu/rcuperf: Add kfree_rcu() performance Tests

From: Paul E. McKenney
Date: Tue Sep 03 2019 - 16:09:23 EST


On Thu, Aug 29, 2019 at 04:56:37PM -0400, Joel Fernandes wrote:
> On Wed, Aug 28, 2019 at 02:12:26PM -0700, Paul E. McKenney wrote:

[ . . . ]

> > > +static int
> > > +kfree_perf_thread(void *arg)
> > > +{
> > > +	int i, loop = 0;
> > > +	long me = (long)arg;
> > > +	struct kfree_obj *alloc_ptr;
> > > +	u64 start_time, end_time;
> > > +
> > > +	VERBOSE_PERFOUT_STRING("kfree_perf_thread task started");
> > > +	set_cpus_allowed_ptr(current, cpumask_of(me % nr_cpu_ids));
> > > +	set_user_nice(current, MAX_NICE);
> > > +
> > > +	start_time = ktime_get_mono_fast_ns();
> > > +
> > > +	if (atomic_inc_return(&n_kfree_perf_thread_started) >= kfree_nrealthreads) {
> > > +		if (gp_exp)
> > > +			b_rcu_gp_test_started = cur_ops->exp_completed() / 2;
> >
> > At some point, it would be good to use the new grace-period
> > sequence-counter functions (rcuperf_seq_diff(), for example) instead of
> > the open-coded division by 2. I freely admit that you are just copying
> > my obsolete hack in this case, so not needed in this patch.
>
> But I am using rcu_seq_diff() below in the pr_alert().
>
> Anyway, I agree this can be a follow-on since this pattern is borrowed from
> another part of rcuperf. However, I am also confused about the pattern
> itself.
>
> If I understand correctly, you are doing the "/ 2" because expedited_sequence
> progresses by 2 for every expedited batch.
>
> But does rcu_seq_diff() really work on these expedited GP numbers, and will
> it be immune to changes in RCU_SEQ_STATE_MASK? Sorry for the silly questions,
> but admittedly I have not yet looked much into expedited RCU, so I could be
> missing the point.

Yes, expedited grace periods use the common sequence-number functions.
Oddly enough, normal grace periods were the last to make use of these.
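The counters keep the grace-period count in the upper bits and the phase in
the low-order RCU_SEQ_STATE_MASK bits, so the helpers strip the state bits
for you, which also means that a change to RCU_SEQ_STATE_MASK is absorbed by
the helpers rather than by their callers. Hand-wavy (and untested) sketch of
the idea, leaving out the rounding that the real rcu_seq_diff() in
kernel/rcu/rcu.h does for in-progress grace periods:

#define RCU_SEQ_CTR_SHIFT	2
#define RCU_SEQ_STATE_MASK	((1 << RCU_SEQ_CTR_SHIFT) - 1)

/*
 * Rough illustration only: number of grace periods between two
 * sequence-counter snapshots, ignoring partially completed ones.
 */
static unsigned long seq_diff_sketch(unsigned long new, unsigned long old)
{
	/* Mask out the low-order state bits, then count whole GPs. */
	return ((new & ~RCU_SEQ_STATE_MASK) -
		(old & ~RCU_SEQ_STATE_MASK)) >> RCU_SEQ_CTR_SHIFT;
}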

> > > +		else
> > > +			b_rcu_gp_test_finished = cur_ops->get_gp_seq();
> > > +
> > > +		pr_alert("Total time taken by all kfree'ers: %llu ns, loops: %d, batches: %ld\n",
> > > +			 (unsigned long long)(end_time - start_time), kfree_loops,
> > > +			 rcuperf_seq_diff(b_rcu_gp_test_finished, b_rcu_gp_test_started));
> > > +		if (shutdown) {
> > > +			smp_mb(); /* Assign before wake. */
> > > +			wake_up(&shutdown_wq);
> > > +		}
> > > +	}
> > > +
> > > +	torture_kthread_stopping("kfree_perf_thread");
> > > +	return 0;
> > > +}
> > > +
> > > +static void
> > > +kfree_perf_cleanup(void)
> > > +{
> > > +	int i;
> > > +
> > > +	if (torture_cleanup_begin())
> > > +		return;
> > > +
> > > +	if (kfree_reader_tasks) {
> > > +		for (i = 0; i < kfree_nrealthreads; i++)
> > > +			torture_stop_kthread(kfree_perf_thread,
> > > +					     kfree_reader_tasks[i]);
> > > +		kfree(kfree_reader_tasks);
> > > +	}
> > > +
> > > +	torture_cleanup_end();
> > > +}
> > > +
> > > +/*
> > > + * shutdown kthread. Just waits to be awakened, then shuts down system.
> > > + */
> > > +static int
> > > +kfree_perf_shutdown(void *arg)
> > > +{
> > > +	do {
> > > +		wait_event(shutdown_wq,
> > > +			   atomic_read(&n_kfree_perf_thread_ended) >=
> > > +			   kfree_nrealthreads);
> > > +	} while (atomic_read(&n_kfree_perf_thread_ended) < kfree_nrealthreads);
> > > +
> > > +	smp_mb(); /* Wake before output. */
> > > +
> > > +	kfree_perf_cleanup();
> > > +	kernel_power_off();
> > > +	return -EINVAL;
> >
> > These last four lines should be combined with those of
> > rcu_perf_shutdown(). Actually, you could fold the two functions together
> > with only a pair of arguments and two one-line wrapper functions, which
> > would be even better.
>
> But the cleanup() function is different in the two cases and will have to be
> passed in as a function pointer. I believe we discussed this during the last
> review as well.

Calling through a pointer should be a non-problem in this case. We are
nowhere near a fastpath.
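
Something like the following is the sort of folding I had in mind. Untested,
the helper name perf_shutdown() is made up, the other names are taken from
the existing rcuperf.c, and the exact argument list is negotiable:

static int perf_shutdown(atomic_t *n_ended, int nthreads, void (*cleanup)(void))
{
	/* Wait for all measurement kthreads to finish. */
	do {
		wait_event(shutdown_wq, atomic_read(n_ended) >= nthreads);
	} while (atomic_read(n_ended) < nthreads);

	smp_mb(); /* Wake before output. */

	cleanup();
	kernel_power_off();
	return -EINVAL;
}

static int kfree_perf_shutdown(void *arg)
{
	return perf_shutdown(&n_kfree_perf_thread_ended, kfree_nrealthreads,
			     kfree_perf_cleanup);
}

static int rcu_perf_shutdown(void *arg)
{
	return perf_shutdown(&n_rcu_perf_writer_finished, nrealwriters,
			     rcu_perf_cleanup);
}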

Thanx, Paul