Re: [PATCH - resend] VFS: use synchronize_rcu_expedited() in namespace_unlock()

From: NeilBrown
Date: Thu Oct 04 2018 - 22:54:15 EST


On Fri, Oct 05 2018, Al Viro wrote:

> On Fri, Oct 05, 2018 at 11:27:37AM +1000, NeilBrown wrote:
>>
>> The synchronize_rcu() in namespace_unlock() is called every time
>> a filesystem is unmounted. If a great many filesystems are mounted,
>> this can cause a noticeable slow-down in, for example, system shutdown.
>>
>> The sequence:
>> mkdir -p /tmp/Mtest/{0..5000}
>> time for i in /tmp/Mtest/*; do mount -t tmpfs tmpfs $i ; done
>> time umount /tmp/Mtest/*
>>
>> on a 4-cpu VM can report 8 seconds to mount the tmpfs filesystems, and
>> 100 seconds to unmount them.
>>
>> Boot the same VM with 1 CPU and it takes 18 seconds to mount the
>> tmpfs filesystems, but only 36 to unmount.
>>
>> If we change the synchronize_rcu() to synchronize_rcu_expedited()
>> the umount time on a 4-cpu VM drops to 0.6 seconds.
>>
>> I think this 200-fold speed up is worth the slightly higher system
>> impact of using synchronize_rcu_expedited().
>>
>> Acked-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> (from general rcu perspective)
>> Signed-off-by: NeilBrown <neilb@xxxxxxxx>
>> ---
>>
>> I posted this last October, then again last November (cc:ing Linus).
>> Paul is happy enough with it, but there has been no other response.
>> I'm hoping it can get applied this time....
>
> Umm... IIRC, the last one got sidetracked on the other thing in the series...
> <checks> that was s_anon stuff. I can live with this one; FWIW, what kind
> of load would trigger the impact of the change? Paul?

I think you would need a long sequence of umounts to notice anything.
What you would notice is substantially reduced wall-clock time, but
slightly increased CPU time.
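
For a sense of scale, here is a back-of-envelope sketch of the per-umount
cost implied by the figures quoted above. It assumes {0..5000} expands to
5001 mount points and that the whole elapsed time is spread evenly across
the umounts; the helper name per_umount_ms is made up for illustration.

```shell
# Average cost of one umount, in milliseconds.
# $1 = total elapsed seconds, $2 = number of umounts
per_umount_ms() {
    awk -v t="$1" -v n="$2" 'BEGIN { printf "%.2f\n", t * 1000 / n }'
}

per_umount_ms 100 5001   # 4 CPUs, synchronize_rcu()
per_umount_ms 36 5001    # 1 CPU,  synchronize_rcu()
per_umount_ms 0.6 5001   # 4 CPUs, synchronize_rcu_expedited()
```

That is roughly 20ms, 7ms and 0.12ms per umount respectively, so a single
umount is unlikely to show any difference; only a long run of them adds up.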

The original bug report that led to this patch was a system with "HUGE
direct automount maps (>23k at this point)".
Stopping autofs (during shutdown) took more minutes than seemed
reasonable.

I noticed it again just recently when working on a systemd issue. If
you mount thousands of filesystems in quick succession (ClearCase can do
this), systemd processes /proc/self/mountinfo constantly and slows down
the whole process. When I unmount my test filesystems (mount --bind
/etc /MNT/$1) it takes a similar amount of time, but now it isn't
systemd slowing things down (which is odd, actually; I wonder why systemd
didn't notice) but rather the synchronize_rcu() delays.

Thanks,
NeilBrown
