Re: [PATCH] proc/sysctl: drop unregistered stale dentries as soon as possible
From: Andrew Morton
Date: Wed Feb 08 2017 - 16:48:19 EST
On Wed, 08 Feb 2017 13:48:24 +0300 Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx> wrote:
> Currently, unregistering a sysctl does not prune its dentries.
> Stale sysctl dentries can slow down sysctl operations significantly.
>
> For example, the command:
>
> # for i in {1..100000} ; do unshare -n -- sysctl -a &> /dev/null ; done
>
> creates millions of stale dentries around the sysctls of the loopback interface:
>
> # sysctl fs.dentry-state
> fs.dentry-state = 25812579 24724135 45 0 0 0
>
> All of them have matching names, so lookup has to scan through the whole
> hash chain and call d_compare (proc_sys_compare), which checks them
> under a system-wide spinlock (sysctl_lock).
>
> # time sysctl -a > /dev/null
> real 1m12.806s
> user 0m0.016s
> sys 1m12.400s
>
> Currently only the memory reclaimer can remove this garbage,
> but without significant memory pressure this never happens.
>
> This patch detects a stale dentry in proc_sys_compare and pretends that
> it has a matching name: revalidation will kill it and the lookup restarts.
> As a result, each stale dentry will be seen only once and will not
> contaminate the hash endlessly.
>
What are "stale" dentries? Unused dentries? If so, why doesn't the
creation of a new dentry immediately invalidate the old dentry with a
matching path? What do other filesystems do to prevent this issue?
IOW I'm wondering if this should be fixed in some other place. Al?
> --- a/fs/proc/proc_sysctl.c
> +++ b/fs/proc/proc_sysctl.c
> @@ -852,11 +852,19 @@ static int proc_sys_compare(const struct dentry *dentry,
>  	inode = d_inode_rcu(dentry);
>  	if (!inode)
>  		return 1;
> +
> +	/*
> +	 * Stale dentry: we cannot invalidate it right here, instead we
> +	 * pretend that it matches and revalidation will kill it later.
> +	 */
> +	head = rcu_dereference(PROC_I(inode)->sysctl);
> +	if (head && head->unregistering)
> +		return 0;
> +
>  	if (name->len != len)
>  		return 1;
>  	if (memcmp(name->name, str, len))
>  		return 1;
> -	head = rcu_dereference(PROC_I(inode)->sysctl);
>  	return !head || !sysctl_is_seen(head);
>  }
>
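For readers following the changelog, the "pretend it matches, let
revalidation kill it, restart the lookup" idea can be modelled in a few
lines of user-space C. This is only a rough sketch and not kernel code:
every toy_* name is hypothetical, the array stands in for one dentry hash
chain, and the "unregistering" flag stands in for head->unregistering.
Locking, RCU and the real dcache are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a dentry sitting on one hash chain. */
struct toy_dentry {
	const char *name;
	bool unregistering;	/* models head->unregistering */
	bool hashed;		/* false once dropped from the chain */
};

/* Models the patched compare: 0 means "match", non-zero means "no match". */
static int toy_compare(const struct toy_dentry *d, const char *name)
{
	if (d->unregistering)
		return 0;	/* stale: pretend the name matches */
	return strcmp(d->name, name) != 0;
}

/* Models revalidation: a stale entry fails and is removed from the chain. */
static bool toy_revalidate(struct toy_dentry *d)
{
	if (d->unregistering) {
		d->hashed = false;
		return false;
	}
	return true;
}

/*
 * Scan the chain, counting compare calls. A failed revalidation restarts
 * the scan. Returns the index of the live match, or -1 if there is none.
 */
static int toy_lookup(struct toy_dentry *chain, size_t n,
		      const char *name, size_t *compares)
{
	size_t i;

	assert(chain != NULL && compares != NULL);
restart:
	for (i = 0; i < n; i++) {
		if (!chain[i].hashed)
			continue;
		(*compares)++;
		if (toy_compare(&chain[i], name))
			continue;
		if (toy_revalidate(&chain[i]))
			return (int)i;
		goto restart;	/* stale entry was dropped, scan again */
	}
	return -1;
}
```

In this toy model, a chain holding three stale entries followed by one
live entry costs four compare calls on the first lookup and one on every
later lookup, because each stale entry is dropped the first time it is
seen.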