Re: Make RCU tree CPU topology aware?
From: Paul E. McKenney
Date: Mon Aug 17 2015 - 11:28:27 EST
On Mon, Aug 17, 2015 at 11:39:34AM +0100, Alexander Gordeev wrote:
> Hi Paul,
>
> Currently the RCU tree distributes CPUs to leaves based on consecutive
> CPU IDs. That means CPUs from remote caches and even remote nodes might
> end up in the same leaf.
>
> I have not researched the impact, but at a glance that seems at least
> sub-optimal, especially in the case of remote nodes, when CPUs access
> each other's memory.
>
> I am thinking of a topology-aware RCU geometry where the RCU tree
> reflects the actual system topology, i.e., by borrowing it from
> scheduling domains or something like that.
>
> Do you think it is worth the effort to research this question, or am I
> missing something and the current access patterns are already optimal?

The first thing to try would be to set the rcutree.rcu_fanout_leaf
kernel boot parameter first to align with the system's hardware
boundaries and then to misalign with them, and see if you can measure
any difference whatsoever at the system level. For example, if you are
using a multi-socket system with eight-core x86 CPUs and hyperthreading
enabled, specify rcutree.rcu_fanout_leaf=8 to account for the
"interesting" x86 CPU numbering. The default of
rcutree.rcu_fanout_leaf=16 would have the first two sockets sharing the
first leaf rcu_node structure. Perhaps also try rcutree.rcu_fanout_leaf=7
and rcutree.rcu_fanout_leaf=9 to tease out contention effects. I suggest
also running tests with hyperthreading disabled.
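
To see which settings line up with socket boundaries, the mapping can be sketched in a few lines of Python. This is only a model, not kernel code: it assumes leaves cover contiguous blocks of rcu_fanout_leaf CPU IDs, and it assumes a hypothetical machine with 4 sockets, 8 cores per socket and 2 hardware threads per core, numbered so that all first threads come first (socket by socket) followed by all second threads. The function names are made up for illustration; check your real numbering (for example with lscpu) before relying on it.

```python
def socket_of(cpu, sockets, cores_per_socket):
    """Socket owning `cpu` under the ASSUMED numbering: first hardware
    threads are numbered socket by socket, then the second threads."""
    return (cpu % (sockets * cores_per_socket)) // cores_per_socket


def sockets_per_leaf(fanout_leaf, sockets=4, cores_per_socket=8,
                     threads_per_core=2):
    """For each leaf, list the sockets whose CPUs it covers, assuming
    each leaf takes a contiguous block of `fanout_leaf` CPU IDs."""
    nr_cpus = sockets * cores_per_socket * threads_per_core
    leaves = []
    for first in range(0, nr_cpus, fanout_leaf):
        cpus = range(first, min(first + fanout_leaf, nr_cpus))
        leaves.append(sorted({socket_of(c, sockets, cores_per_socket)
                              for c in cpus}))
    return leaves


if __name__ == "__main__":
    for fanout_leaf in (7, 8, 9, 16):
        leaves = sockets_per_leaf(fanout_leaf)
        shared = sum(1 for s in leaves if len(s) > 1)
        print(f"rcu_fanout_leaf={fanout_leaf}: {len(leaves)} leaves, "
              f"{shared} spanning more than one socket")
```

Under these assumptions a value of 8 keeps every leaf within one socket, while 7, 9 and 16 each produce leaves that straddle sockets, which is what makes them useful as misaligned test cases.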

I bet that you won't see any system-level effect. The reason for that
bet is that people have been asking me this for years, but have always
declined to provide any data. In addition, RCU's fast paths are designed
to avoid hitting the rcu_node structures -- even call_rcu() normally is
confined to the per-CPU rcu_data structure.

Please note that I am particularly unhappy with the thought of RCU
having non-contiguous CPU numbering within the rcu_node structures.
For example, having the first rcu_node structure cover CPUs 0-7 and
32-39, the second cover 8-15 and 40-47, and so on is really really ugly.
That isn't to say that I am inalterably opposed, but rather that there
had better be extremely good measurable system-level reasons for such
a change.
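
For concreteness, here is a small Python sketch of where ranges like those come from. It is an illustration only: it assumes a hypothetical 4-socket machine with 8 cores per socket and 2 hardware threads per core, numbered with all first threads first and then all second threads, and the helper names are invented for this example. Grouping CPUs by socket under that numbering gives each group two disjoint ID ranges.

```python
def cpus_by_socket(sockets=4, cores_per_socket=8, threads_per_core=2):
    """Group CPU IDs by socket under the ASSUMED numbering (first
    hardware threads socket by socket, then the second threads)."""
    groups = {s: [] for s in range(sockets)}
    per_pass = sockets * cores_per_socket
    for cpu in range(per_pass * threads_per_core):
        groups[(cpu % per_pass) // cores_per_socket].append(cpu)
    return groups


def as_ranges(cpus):
    """Compress a sorted, non-empty CPU list into a string like '0-7,32-39'."""
    ranges = []
    start = prev = cpus[0]
    for cpu in cpus[1:]:
        if cpu != prev + 1:          # gap: close the current range
            ranges.append((start, prev))
            start = cpu
        prev = cpu
    ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


if __name__ == "__main__":
    for socket, cpus in cpus_by_socket().items():
        print(f"socket {socket}: CPUs {as_ranges(cpus)}")
```

Each socket-aligned group needs more than one range to describe, so a socket-aligned rcu_node could no longer be described by a single contiguous CPU range.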

On the other hand, having some sort of option to allow architectures to
specify the RCU_FANOUT and RCU_FANOUT_LEAF values at boot time is not
that big a deal.

Does that help?

Thanx, Paul