Re: [PATCH, RFC, tip/core/rcu] scalable classic RCU implementation

From: Paul E. McKenney
Date: Sat Aug 23 2008 - 22:44:18 EST


On Sat, Aug 23, 2008 at 06:07:35PM +0200, Ingo Molnar wrote:
>
> * Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>
> > Is this a sufficient improvement?
>
> yeah - looks much better. This was the block that meets the eye for the
> first time in the patch so it stuck out.
>
> just one more small pet peeve of mine: please use vertical alignment too
> to improve readability. Instead of:
>
> > #define MAX_RCU_LEVELS 3
> > #define RCU_FANOUT (CONFIG_RCU_FANOUT)
> > #define RCU_FANOUT_SQ (RCU_FANOUT * RCU_FANOUT)
> > #define RCU_FANOUT_CUBE (RCU_FANOUT_SQ * RCU_FANOUT)
>
> this looks a bit more structured IMO:
>
> > #define MAX_RCU_LEVELS  3
> > #define RCU_FANOUT      (CONFIG_RCU_FANOUT)
> > #define RCU_FANOUT_SQ   (RCU_FANOUT * RCU_FANOUT)
> > #define RCU_FANOUT_CUBE (RCU_FANOUT_SQ * RCU_FANOUT)

Good point, fixed.

> maybe even this:
>
> > #if (NR_CPUS) <= RCU_FANOUT
> > # define NUM_RCU_LVLS  1
> > # define NUM_RCU_LVL_0 1
> > # define NUM_RCU_LVL_1 (NR_CPUS)
> > # define NUM_RCU_LVL_2 0
> > # define NUM_RCU_LVL_3 0
> > #elif (NR_CPUS) <= RCU_FANOUT_SQ
> > # define NUM_RCU_LVLS  2
> > # define NUM_RCU_LVL_0 1
> > # define NUM_RCU_LVL_1 (((NR_CPUS) + RCU_FANOUT - 1) / RCU_FANOUT)
> > # define NUM_RCU_LVL_2 (NR_CPUS)
> > # define NUM_RCU_LVL_3 0
> > #elif (NR_CPUS) <= RCU_FANOUT_CUBE
> > # define NUM_RCU_LVLS  3
> > # define NUM_RCU_LVL_0 1
> > # define NUM_RCU_LVL_1 (((NR_CPUS) + RCU_FANOUT_SQ - 1) / RCU_FANOUT_SQ)
> > # define NUM_RCU_LVL_2 (((NR_CPUS) + (RCU_FANOUT) - 1) / (RCU_FANOUT))
> > # define NUM_RCU_LVL_3 NR_CPUS
> > #else
> > # error "CONFIG_RCU_FANOUT insufficient for NR_CPUS"
> > #endif /* #if (NR_CPUS) <= RCU_FANOUT */
>
> but no strong feelings on that one. (maybe inserting a space at the
> right places helps too, no need for a full tab)

Yep, just like you, spaced it just enough to keep the longest one from
running over one line. ;-)

I left the definitions for RCU_SUM and NUM_RCU_NODES compact, though:

#define RCU_SUM (NUM_RCU_LVL_0 + NUM_RCU_LVL_1 + NUM_RCU_LVL_2 + NUM_RCU_LVL_3)
#define NUM_RCU_NODES (RCU_SUM - NR_CPUS)

The other alternative would be to stack RCU_SUM as follows:

#define RCU_SUM (NUM_RCU_LVL_0 + NUM_RCU_LVL_1 + \
                 NUM_RCU_LVL_2 + NUM_RCU_LVL_3)

which seemed to me to add more ugly than enlightenment.
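
For anyone wanting to check the arithmetic behind these macros, here is a rough userspace sketch that mirrors the #if ladder and the RCU_SUM / NUM_RCU_NODES computation at run time. It is not part of the patch: the names div_round_up(), struct rcu_geometry, and rcu_compute_geometry() are made up for illustration, and the fanout is passed as a parameter instead of coming from CONFIG_RCU_FANOUT.

```c
#include <assert.h>

/* Hypothetical helper (not in the patch): ceiling division. */
static int div_round_up(int n, int d)
{
	return (n + d - 1) / d;
}

/* Hypothetical container for the values the macros compute. */
struct rcu_geometry {
	int levels;	/* NUM_RCU_LVLS, or -1 for the #error case */
	int lvl[4];	/* NUM_RCU_LVL_0 .. NUM_RCU_LVL_3 */
	int nodes;	/* NUM_RCU_NODES = RCU_SUM - NR_CPUS */
};

/* Run-time mirror of the preprocessor ladder quoted above. */
static struct rcu_geometry rcu_compute_geometry(int nr_cpus, int fanout)
{
	struct rcu_geometry g = { 0, { 1, 0, 0, 0 }, 0 };
	int fanout_sq = fanout * fanout;
	int fanout_cube = fanout_sq * fanout;
	int sum;

	if (nr_cpus <= fanout) {
		g.levels = 1;
		g.lvl[1] = nr_cpus;
	} else if (nr_cpus <= fanout_sq) {
		g.levels = 2;
		g.lvl[1] = div_round_up(nr_cpus, fanout);
		g.lvl[2] = nr_cpus;
	} else if (nr_cpus <= fanout_cube) {
		g.levels = 3;
		g.lvl[1] = div_round_up(nr_cpus, fanout_sq);
		g.lvl[2] = div_round_up(nr_cpus, fanout);
		g.lvl[3] = nr_cpus;
	} else {
		g.levels = -1;	/* fanout insufficient for nr_cpus */
		return g;
	}

	/* The last populated level counts the CPUs themselves, so
	 * subtracting nr_cpus leaves only the interior nodes. */
	sum = g.lvl[0] + g.lvl[1] + g.lvl[2] + g.lvl[3];
	g.nodes = sum - nr_cpus;
	return g;
}
```

With these assumptions, rcu_compute_geometry(4096, 64) gives two levels and 1 + 64 = 65 nodes, while a fanout of 16 for the same 4096 CPUs gives three levels and 1 + 16 + 256 = 273 nodes.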

Testing is going well. I occasionally have to restrain myself from
going full-bore for 4096-CPU optimality, but I need to keep it simple
until/unless someone with a machine that large shows where
improvements are needed.

Thanx, Paul