Re: [PATCH] VFS: use synchronize_rcu_expedited() in namespace_unlock()
From: Paul E. McKenney
Date: Thu Oct 26 2017 - 08:27:53 EST
On Thu, Oct 26, 2017 at 01:26:37PM +1100, NeilBrown wrote:
>
> The synchronize_rcu() in namespace_unlock() is called every time
> a filesystem is unmounted. If a great many filesystems are mounted,
> this can cause a noticeable slowdown in, for example, system shutdown.
>
> The sequence:
> 	mkdir -p /tmp/Mtest/{0..5000}
> 	time for i in /tmp/Mtest/*; do mount -t tmpfs tmpfs $i ; done
> 	time umount /tmp/Mtest/*
>
> on a 4-cpu VM can report 8 seconds to mount the tmpfs filesystems, and
> 100 seconds to unmount them.
>
> Boot the same VM with 1 CPU and it takes 18 seconds to mount the
> tmpfs filesystems, but only 36 to unmount.
>
> If we change the synchronize_rcu() to synchronize_rcu_expedited(),
> the same 4-cpu VM still takes 8 seconds to mount, but only 0.6
> seconds to unmount.
>
> I think this 200-fold speed-up is worth the slightly higher system
> impact of using synchronize_rcu_expedited().
>
> Signed-off-by: NeilBrown <neilb@xxxxxxxx>
> ---
>
> Cc: to Paul and Josh in case they'll correct me if using _expedited()
> is really bad here.

I suspect that filesystem unmount is pretty rare in production real-time
workloads, which are the ones that might care. So I would guess that
this is OK.

If the real-time guys ever do want to do filesystem unmounts while their
real-time applications are running, they might modify this so that it can
use synchronize_rcu() instead for real-time builds of the kernel.

But just for completeness, one way to make this work across the board
might be to instead use call_rcu(), with the callback function kicking
off a workqueue handler to do the rest of the unmount. Of course,
in saying that, I am ignoring any mutexes that you might be holding
across this whole thing, and also ignoring any problems that might arise
when returning to userspace with some portion of the unmount operation
still pending. (For example, someone unmounting a filesystem and then
immediately remounting that same filesystem.)
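
As a rough illustration of the call_rcu() idea above, here is a small
userspace model of the control flow. It is not kernel code and not a
proposed patch: every name in it (struct deferred, model_namespace_unlock(),
model_grace_period_end(), model_worker_run(), mounts_released) is
hypothetical, and the two simple queues only stand in for call_rcu() and a
workqueue. It shows the unlock path returning at once, the grace-period
callback merely queueing work, and the work item doing the final teardown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Local helper, modelled on the kernel macro of the same spirit. */
#define CONTAINER_OF(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* One deferred function call; hypothetical stand-in for rcu_head/work_struct. */
struct deferred {
	void (*func)(struct deferred *d);
	struct deferred *next;
};

static struct deferred *rcu_queue;	/* callbacks waiting for a grace period */
static struct deferred *work_queue;	/* work items waiting for a worker */
static int mounts_released;		/* mounts whose teardown has finished */

/* Append d to the tail of the FIFO list *q. */
static void enqueue(struct deferred **q, struct deferred *d)
{
	d->next = NULL;
	while (*q)
		q = &(*q)->next;
	*q = d;
}

/* Run and remove every entry currently on *q; return how many ran. */
static int run_all(struct deferred **q)
{
	int n = 0;

	while (*q) {
		struct deferred *d = *q;

		*q = d->next;	/* unlink first: func may free the entry */
		d->func(d);
		n++;
	}
	return n;
}

struct pending_unmount {
	struct deferred rcu;	/* models the call_rcu() callback */
	struct deferred work;	/* models the workqueue item */
	int nr_mounts;
};

/* Work handler: the "rest of the unmount" (models group_pin_kill()). */
static void finish_unmount_work(struct deferred *w)
{
	struct pending_unmount *p = CONTAINER_OF(w, struct pending_unmount, work);

	mounts_released += p->nr_mounts;
	free(p);
}

/* Grace-period callback: does no teardown itself, only queues the work. */
static void unmount_rcu_cb(struct deferred *r)
{
	struct pending_unmount *p = CONTAINER_OF(r, struct pending_unmount, rcu);

	p->work.func = finish_unmount_work;
	enqueue(&work_queue, &p->work);
}

/* Models the unlock path: queue the callback and return without waiting. */
static void model_namespace_unlock(int nr_mounts)
{
	struct pending_unmount *p = malloc(sizeof(*p));

	assert(p != NULL);
	p->nr_mounts = nr_mounts;
	p->rcu.func = unmount_rcu_cb;
	enqueue(&rcu_queue, &p->rcu);
}

/* Models a grace period ending: all queued callbacks run. */
static int model_grace_period_end(void)
{
	return run_all(&rcu_queue);
}

/* Models a worker thread draining the workqueue. */
static int model_worker_run(void)
{
	return run_all(&work_queue);
}
```

Calling model_namespace_unlock(3) leaves mounts_released at 0 until both
model_grace_period_end() and model_worker_run() have been called, which is
the "unmount still pending on return to userspace" window mentioned above.
The model deliberately ignores locking, which is one of the caveats raised
above.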

							Thanx, Paul

> Thanks,
> NeilBrown
>
>
> fs/namespace.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 3b601f115b6c..fce91c447fab 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -1420,7 +1420,7 @@ static void namespace_unlock(void)
>  	if (likely(hlist_empty(&head)))
>  		return;
>
> -	synchronize_rcu();
> +	synchronize_rcu_expedited();
>
>  	group_pin_kill(&head);
>  }
> --
> 2.14.0.rc0.dirty
>