Re: [PATCH] net: sch: eliminate unnecessary RCU waits in mini_qdisc_pair_swap()

From: Seth Forshee
Date: Tue Oct 26 2021 - 08:27:47 EST


On Mon, Oct 25, 2021 at 12:48:28PM -0700, Jakub Kicinski wrote:
> On Fri, 22 Oct 2021 11:17:46 -0500 Seth Forshee wrote:
> > From: Seth Forshee <sforshee@xxxxxxxxxxxxxxxx>
> >
> > Currently rcu_barrier() is used to ensure that no readers of the
> > inactive mini_Qdisc buffer remain before it is reused. This waits for
> > any pending RCU callbacks to complete, when all that is actually
> > required is to wait for one RCU grace period to elapse after the buffer
> > was made inactive. This means that using rcu_barrier() may result in
> > unnecessary waits.
> >
> > To improve this, store the current RCU state when a buffer is made
> > inactive and use poll_state_synchronize_rcu() to check whether a full
> > grace period has elapsed before reusing it. If a full grace period has
> > not elapsed, wait for a grace period to elapse, and in the non-RT case
> > use synchronize_rcu_expedited() to hasten it.
> >
> > Since this approach eliminates the RCU callback it is no longer
> > necessary to synchronize_rcu() in the tp_head==NULL case. However, the
> > RCU state should still be saved for the previously active buffer.
> >
> > Before this change I would typically see mini_qdisc_pair_swap() take
> > tens of milliseconds to complete. After this change it typically
> > finishes in less than 1 ms, and often it takes just a few microseconds.
> >
> > Thanks to Paul for walking me through the options for improving this.
> >
> > Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> > Signed-off-by: Seth Forshee <sforshee@xxxxxxxxxxxxxxxx>
>
> LGTM, but please rebase and retest on top of latest net-next.

Will do.
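
For anyone following along without the RCU background, the idea in the
commit message (record the RCU state when a buffer goes inactive, poll
it before reuse, and only force a grace period if one has not already
elapsed) can be modelled in userspace roughly as below. This is only a
sketch: every mock_* name is hypothetical, the grace-period counter is
a stand-in for real RCU state, and the PREEMPT_RT path is not modelled.

```c
#include <stdbool.h>

/* Mock grace-period counter; a stand-in for the kernel's RCU state. */
static unsigned long mock_gp_completed;
/* Counts how often reuse had to force a grace period. */
static unsigned long mock_forced_waits;

/* Plays the role of get_state_synchronize_rcu(): returns a cookie
 * naming the grace period that must complete before reuse is safe. */
static unsigned long mock_get_state(void)
{
	return mock_gp_completed + 1;
}

/* Plays the role of poll_state_synchronize_rcu(): true once a full
 * grace period has elapsed since the cookie was taken. */
static bool mock_poll_state(unsigned long cookie)
{
	return mock_gp_completed >= cookie;
}

/* A grace period completing on its own while we do other work. */
static void mock_grace_period_elapses(void)
{
	mock_gp_completed++;
}

/* Plays the role of synchronize_rcu_expedited(): returns only after a
 * grace period has elapsed. */
static void mock_synchronize_expedited(void)
{
	mock_forced_waits++;
	mock_gp_completed++;
}

/* Hypothetical buffer carrying the saved RCU state. */
struct mock_buf {
	unsigned long rcu_state;
};

/* Called when the buffer becomes inactive. */
static void mock_retire(struct mock_buf *buf)
{
	buf->rcu_state = mock_get_state();
}

/* Called before the buffer is reused: wait only if we have to. */
static void mock_prepare_reuse(struct mock_buf *buf)
{
	if (!mock_poll_state(buf->rcu_state))
		mock_synchronize_expedited();
}
```

If a grace period happens to elapse between mock_retire() and
mock_prepare_reuse(), no wait is forced; otherwise exactly one is.
That is the difference from waiting unconditionally on every swap.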

> >  void mini_qdisc_pair_swap(struct mini_Qdisc_pair *miniqp,
> >  			  struct tcf_proto *tp_head)
> >  {
> > @@ -1423,28 +1419,30 @@ void mini_qdisc_pair_swap(struct mini_Qdisc_pair *miniqp,
> >
> >  	if (!tp_head) {
> >  		RCU_INIT_POINTER(*miniqp->p_miniq, NULL);
> > -		/* Wait for flying RCU callback before it is freed. */
> > -		rcu_barrier();
> > -		return;
> > -	}
> > +	} else {
> > +		miniq = !miniq_old || miniq_old == &miniqp->miniq2 ?
> > +			&miniqp->miniq1 : &miniqp->miniq2;
> >
> > -	miniq = !miniq_old || miniq_old == &miniqp->miniq2 ?
> > -		&miniqp->miniq1 : &miniqp->miniq2;
>
> nit: any reason this doesn't read:
>
> 	miniq = miniq_old != &miniqp->miniq1 ?
> 		&miniqp->miniq1 : &miniqp->miniq2;
>
> Surely it's not equal to miniq1 or miniq2 if it's NULL.

I agree, that looks simpler and functionally equivalent. It seems
off-topic for this patch though; I'm only touching that line to change
the indentation.
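
The equivalence is easy to check exhaustively, assuming miniq_old can
only ever be NULL, &miniqp->miniq1, or &miniqp->miniq2. A small
userspace sketch, where the struct definitions are stand-ins and the
pick_* and selections_agree names are made up for illustration:

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the kernel types; only the two embedded buffers matter. */
struct mini_Qdisc {
	int unused;
};

struct mini_Qdisc_pair {
	struct mini_Qdisc miniq1;
	struct mini_Qdisc miniq2;
};

/* Selection as currently written. */
static struct mini_Qdisc *pick_current(struct mini_Qdisc_pair *miniqp,
				       struct mini_Qdisc *miniq_old)
{
	return !miniq_old || miniq_old == &miniqp->miniq2 ?
		&miniqp->miniq1 : &miniqp->miniq2;
}

/* Selection as suggested in the review. */
static struct mini_Qdisc *pick_suggested(struct mini_Qdisc_pair *miniqp,
					 struct mini_Qdisc *miniq_old)
{
	return miniq_old != &miniqp->miniq1 ?
		&miniqp->miniq1 : &miniqp->miniq2;
}

/* True if both forms agree for every value miniq_old is assumed to take. */
static bool selections_agree(void)
{
	struct mini_Qdisc_pair pair = { { 0 }, { 0 } };
	struct mini_Qdisc *inputs[] = { NULL, &pair.miniq1, &pair.miniq2 };
	size_t i;

	for (i = 0; i < sizeof(inputs) / sizeof(inputs[0]); i++) {
		if (pick_current(&pair, inputs[i]) !=
		    pick_suggested(&pair, inputs[i]))
			return false;
	}
	return true;
}
```

Both forms pick miniq1 for NULL and for miniq2, and miniq2 for miniq1,
so the simplification changes nothing under that assumption.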

Thanks,
Seth