Re: [PATCH - resend] VFS: use synchronize_rcu_expedited() in namespace_unlock()
From: Paul E. McKenney
Date: Fri Oct 05 2018 - 00:08:24 EST
On Fri, Oct 05, 2018 at 02:40:02AM +0100, Al Viro wrote:
> On Fri, Oct 05, 2018 at 11:27:37AM +1000, NeilBrown wrote:
> >
> > The synchronize_rcu() in namespace_unlock() is called every time
> > a filesystem is unmounted. If a great many filesystems are mounted,
> > this can cause a noticeable slow-down in, for example, system shutdown.
> >
> > The sequence:
> > mkdir -p /tmp/Mtest/{0..5000}
> > time for i in /tmp/Mtest/*; do mount -t tmpfs tmpfs $i ; done
> > time umount /tmp/Mtest/*
> >
> > on a 4-cpu VM can report 8 seconds to mount the tmpfs filesystems, and
> > 100 seconds to unmount them.
> >
> > Boot the same VM with 1 CPU and it takes 18 seconds to mount the
> > tmpfs filesystems, but only 36 to unmount.
> >
> > If we change the synchronize_rcu() to synchronize_rcu_expedited()
> > the umount time on a 4-cpu VM drops to 0.6 seconds.
> >
> > I think this 200-fold speed up is worth the slightly higher system
> > impact of using synchronize_rcu_expedited().
> >
> > Acked-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> (from general rcu perspective)
> > Signed-off-by: NeilBrown <neilb@xxxxxxxx>
> > ---
> >
> > I posted this last October, then again last November (cc:ing Linus)
> > Paul is happy enough with it, but no other response.
> > I'm hoping it can get applied this time....
>
> Umm... IIRC, the last one got sidetracked on the other thing in the series...
> <checks> that was s_anon stuff. I can live with this one; FWIW, what kind
> of load would trigger the impact of the change? Paul?

You lost me with "what kind of load would trigger the impact of the
change?", but if you are asking about the downside, that would be IPIs
sent from each call to synchronize_rcu_expedited(). But people with
things like real-time workloads that therefore don't like those IPIs
have a number of options:

1.  Boot with rcupdate.rcu_normal=1, which converts all calls to
    synchronize_rcu_expedited() to synchronize_rcu(). This of
    course loses the performance gain, but this can be a good
    tradeoff for real-time workloads.

2.  Build with CONFIG_NO_HZ_FULL=y and boot with nohz_full= to
    cover the CPUs running your real-time workload. Then
    as long as there is only one runnable usermode task per
    nohz_full CPU, synchronize_rcu_expedited() will avoid sending
    IPIs to any of the nohz_full CPUs.

3.  Don't do unmounts while your real-time application is running.

Probably other options as well, but those are the ones that come
immediately to mind.
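
As a rough sketch of how one might check which of the first two options
a given system was booted with: the function name check_rcu_cmdline is
made up for illustration, and it only inspects the kernel command line,
so it cannot tell you whether CONFIG_NO_HZ_FULL=y was set at build time
or whether each nohz_full CPU really has just one runnable usermode task.

```shell
#!/bin/sh
# Hypothetical helper: report which of the boot-time options above
# appear on a kernel command line. Pass a command line as $1 to check
# an arbitrary string; with no argument it reads /proc/cmdline.
check_rcu_cmdline() {
    cmdline=${1:-$(cat /proc/cmdline)}
    for arg in $cmdline; do
        case "$arg" in
            rcupdate.rcu_normal=1)
                # Option 1: expedited grace periods become normal ones.
                echo "option 1: $arg present"
                ;;
            nohz_full=*)
                # Option 2: the CPU list follows the "=".
                echo "option 2: $arg present (CPUs ${arg#nohz_full=})"
                ;;
        esac
    done
}
```

Calling check_rcu_cmdline with no argument checks the running kernel;
no output means neither boot parameter was given.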

If I missed the point of your question, please help me understand
what you are asking for.

Thanx, Paul