Re: [RFC 4/4] net/ipv4/fib: Don't synchronise_rcu() every 512Kb

From: David Ahern
Date: Tue Mar 26 2019 - 11:39:37 EST


On 3/26/19 9:30 AM, Dmitry Safonov wrote:
> Fib trie has a hard-coded sync_pages limit to call synchronise_rcu().
> The limit is 128 pages or 512Kb (considering common case with 4Kb
> pages).
>
> Unfortunately, at Arista we have use-scenarios with full view software
> forwarding. At the scale of 100K and more routes even on 2 core boxes
> the hard-coded limit starts actively shooting in the leg: lockup
> detector notices that rtnl_lock is held for seconds.
> First reason is previously broken MAX_WORK, that didn't limit pending
> balancing work. While fixing it, I've noticed that the bottle-neck is
> actually in the number of synchronise_rcu() calls.
>
> I've tried to fix it with a patch to decrement number of tnodes in rcu
> callback, but it hasn't much affected performance.
>
> One possible way to "fix" it - provide another sysctl to control
> sync_pages, but in my POV it's nasty - exposing another realisation
> detail into user-space.

well, that was accepted last week. ;-)

commit 9ab948a91b2c2abc8e82845c0e61f4b1683e3a4f
Author: David Ahern <dsahern@xxxxxxxxx>
Date: Wed Mar 20 09:18:59 2019 -0700

ipv4: Allow amount of dirty memory from fib resizing to be controllable


Can you see how that change (should backport easily) affects your test
case? From my perspective 16MB was the sweet spot.