Re: [RFC PATCH] vfs: limit directory child dentry retention

From: Christian Brauner

Date: Tue Mar 31 2026 - 05:45:39 EST


On Tue, Mar 31, 2026 at 09:29:09AM +0800, Ian Kent wrote:
> If there's a very large number of children present in a directory dentry
> then the benefit from retaining stale child dentries for re-use can
> become ineffective. Even hashed lookup can become ineffective as hash
> chains grow, time taken to umount a file system can increase a lot, as
> well as child dentry traversals resulting in lock held too long log
> messages.

Fwiw, there's also e6957c99dca5 ("vfs: Add a sysctl for automated deletion of dentry").
From its description:

This patch introduces the concept conditionally, where the associated
dentry is deleted only when the user explicitly opts for it during file
removal. A new sysctl fs.automated_deletion_of_dentry is added for this
purpose. Its default value is set to 0.
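
To check whether that knob is enabled on a given system, something like
the sketch below would do. read_sysctl_ulong() is a hypothetical helper
made up for illustration, and the /proc path in the note after it is only
assumed from the sysctl name above, so verify it on the kernel in
question.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Read one unsigned integer from a sysctl-style file.
 * Returns 0 and stores the value in *out on success, -1 on failure.
 * Hypothetical helper: not a libc or kernel API.
 */
static int read_sysctl_ulong(const char *path, unsigned long *out)
{
	char buf[64];
	char *end;
	unsigned long val;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	errno = 0;
	val = strtoul(buf, &end, 10);
	/* Reject range errors and input with no leading digits. */
	if (errno || end == buf)
		return -1;

	*out = val;
	return 0;
}
```

Calling it with "/proc/sys/fs/automated_deletion_of_dentry" (assumed
path) should yield 0 unless an administrator has opted in. A return of -1
most likely means the running kernel does not carry that commit.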

I have no massive objections to your approach. It feels a bit hacky tbh
as it seems to degrade performance for new workloads in favor of old
workloads. The LRU should sort this out though.

> But when a directory dentry has a very large number of children the
> parent dentry reference count is dominated by the contribution of its
> children. So it makes sense to not retain dentries if the parent
> reference count is large.
>
> Setting some large high water mark (eg. 500000) over which dentries
> are discarded instead of retained on final dput() would help a lot
> by preventing dentry caching contributing to the problem.
>
> Signed-off-by: Ian Kent <raven@xxxxxxxxxx>
> ---
> Documentation/admin-guide/sysctl/fs.rst | 7 +++++++
> fs/dcache.c | 28 +++++++++++++++++++++++++
> 2 files changed, 35 insertions(+)
>
> diff --git a/Documentation/admin-guide/sysctl/fs.rst b/Documentation/admin-guide/sysctl/fs.rst
> index 9b7f65c3efd8..7649254f2d0d 100644
> --- a/Documentation/admin-guide/sysctl/fs.rst
> +++ b/Documentation/admin-guide/sysctl/fs.rst
> @@ -75,6 +75,13 @@ negative dentries which do not map to any files. Instead,
> they help speeding up rejection of non-existing files provided
> by the users.
>
> +dir-stale-max
> +-------------
> +
> +Used to limit the number of stale child dentries retained in a
> +directory before the benefit of caching the dentry is negated by
> +the cost of traversing hash buckets during lookups or enumerating
> +the directory children. Initially set to 500000.
>
> file-max & file-nr
> ------------------
> diff --git a/fs/dcache.c b/fs/dcache.c
> index 7ba1801d8132..298b4c3b1493 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -86,6 +86,14 @@ __cacheline_aligned_in_smp DEFINE_SEQLOCK(rename_lock);
>
> EXPORT_SYMBOL(rename_lock);
>
> +static long dsm_zero = 0;
> +static long dsm_max = ULONG_MAX/2;
> +
> +/* Highwater mark for number of stale entries in a directory (loosely
> + * measured by parent dentry reference count).
> + */
> +static unsigned long dir_stale_max __read_mostly = 500000;
> +
> static struct kmem_cache *__dentry_cache __ro_after_init;
> #define dentry_cache runtime_const_ptr(__dentry_cache)
>
> @@ -216,6 +224,15 @@ static const struct ctl_table fs_dcache_sysctls[] = {
> .extra1 = SYSCTL_ZERO,
> .extra2 = SYSCTL_ONE,
> },
> + {
> + .procname = "dir-stale-max",
> + .data = &dir_stale_max,
> + .maxlen = sizeof(dir_stale_max),
> + .mode = 0644,
> + .proc_handler = proc_doulongvec_minmax,
> + .extra1 = &dsm_zero,
> + .extra2 = &dsm_max,
> + },
> };
>
> static const struct ctl_table vm_dcache_sysctls[] = {
> @@ -768,6 +785,17 @@ static inline bool retain_dentry(struct dentry *dentry, bool locked)
> if (unlikely(d_flags & DCACHE_DONTCACHE))
> return false;
>
> + if (dir_stale_max) {
> + unsigned long p_count;
> +
> + // If the parent reference count is higher than some large value
> + // it's dominated by the contribution of its children so there's
> + // no benefit caching the dentry over re-allocating it.
> + p_count = READ_ONCE(dentry->d_parent->d_lockref.count);
> + if (unlikely(p_count > dir_stale_max))
> + return false;
> + }
> +
> // At this point it looks like we ought to keep it. We also might
> // need to do something - put it on LRU if it wasn't there already
> // and mark it referenced if it was on LRU, but not marked yet.
> --
> 2.53.0
>
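
For anyone who wants to poke at the threshold behaviour outside the
kernel, the check added to retain_dentry() can be modelled in userspace
roughly as below. This is a sketch, not the patch itself: struct
toy_dentry, toy_dir_stale_max and toy_retain_dentry() are made-up
stand-ins, and only the parent-count comparison is modelled, none of the
other retain_dentry() conditions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct dentry; only the fields the check needs. */
struct toy_dentry {
	struct toy_dentry *parent;	/* models d_parent */
	int count;			/* models d_lockref.count */
};

/* Mirrors the proposed default of 500000; 0 disables the check. */
static unsigned long toy_dir_stale_max = 500000;

/* Return false when the dentry would be discarded on final dput(). */
static bool toy_retain_dentry(const struct toy_dentry *dentry)
{
	if (toy_dir_stale_max) {
		int p_count = dentry->parent->count;

		/*
		 * Assumption of this sketch: negative counts are skipped
		 * rather than converted to a huge unsigned value.
		 */
		if (p_count > 0 && (unsigned long)p_count > toy_dir_stale_max)
			return false;
	}
	return true;
}
```

The comparison is strict, so a parent count of exactly 500000 still
retains the child and 500001 does not. The lockref count is a signed int
while the limit is unsigned long, which is why the sketch tests the sign
before casting.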