Re: [PATCH v6 0/7] fs/dcache: Track & limit # of negative dentries

From: Matthew Wilcox
Date: Mon Jul 16 2018 - 21:30:38 EST


On Mon, Jul 16, 2018 at 04:40:32PM -0700, Andrew Morton wrote:
> On Mon, 16 Jul 2018 05:41:15 -0700 Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> > On Mon, Jul 16, 2018 at 11:09:01AM +0200, Michal Hocko wrote:
> > > On Fri 13-07-18 10:36:14, Dave Chinner wrote:
> > > [...]
> > > > By limiting the number of negative dentries in this case, internal
> > > > slab fragmentation is reduced such that reclaim cost never gets out
> > > > of control. While it appears to "fix" the symptoms, it doesn't
> > > > address the underlying problem. It is a partial solution at best but
> > > > at worst it's another opaque knob that nobody knows how or when to
> > > > tune.
> > >
> > > Would it help to put all the negative dentries into its own slab cache?
> >
> > Maybe the dcache should be more sensitive to its own needs. In __d_alloc,
> > it could check whether there is a high proportion of negative dentries
> > and start recycling some existing negative dentries.
>
> Well, yes.
>
> The proposed patchset adds all this background reclaiming. Problem is
> a) that background reclaiming sometimes can't keep up so a synchronous
> direct-reclaim was added on top and b) reclaiming dentries in the
> background will cause non-dentry-allocating tasks to suffer because of
> activity from the dentry-allocating tasks, which is inappropriate.

... and it's an awful lot of code (almost 600 lines!) to implement
something fairly conceptually simple.

> I expect a better design is something like
>
> 	__d_alloc()
> 	{
> 		...
> 		while (too many dentries)
> 			call the dcache shrinker
> 		...
> 	}
>
> and that's it. This way we have a hard upper limit and only the tasks
> which are creating dentries suffer the cost.

I think the "too many total dentries" is probably handled just fine
by the core MM. What the dentry cache needs to prevent is adding a
disproportionately large number of useless negative dentries.

So I'd rather see:

	if (too_many_negative(nr_dentry, nr_dentry_neg))
		reclaim_negative_dentries(16);
	...

16 feels like a fairly natural batch size. I don't know what
too_many_negative() looks like. Maybe it's:

bool too_many_negative(unsigned int total, unsigned int neg)
{
	if (neg < 100)
		return false;
	if (neg * 5 < total * 2)
		return false;
	return true;
}

but it could be almost arbitrarily complex. I do think it needs to
scale with the total number of dentries, not with the memory size of
the machine, the number of CPUs, or anything similar.
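
To make the heuristic concrete, here is a rough userspace sketch of the
check and the batched reclaim in front of an allocation. It is only a
model under stated assumptions: struct dcache_model, model_alloc_negative()
and the body of reclaim_negative_dentries() are hypothetical stand-ins,
not existing dcache code. The one deliberate change from the snippet
above is widening the multiplication to 64 bits, since neg * 5 in
unsigned int would wrap once neg exceeds roughly 858 million.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NEG_MIN_COUNT		100u	/* below this, never reclaim */
#define NEG_RECLAIM_BATCH	16u	/* batch size suggested above */

/* True once negatives make up at least 2/5 (40%) of all dentries. */
static bool too_many_negative(unsigned int total, unsigned int neg)
{
	if (neg < NEG_MIN_COUNT)
		return false;
	/* Widen to 64 bits so the multiplications cannot wrap. */
	return (uint64_t)neg * 5 >= (uint64_t)total * 2;
}

/* Hypothetical model of the two counters; not kernel state. */
struct dcache_model {
	unsigned int nr_dentry;
	unsigned int nr_dentry_neg;
};

/* Stand-in for the real reclaim: drop up to 'count' negative dentries. */
static unsigned int reclaim_negative_dentries(struct dcache_model *dc,
					      unsigned int count)
{
	unsigned int n = count < dc->nr_dentry_neg ? count : dc->nr_dentry_neg;

	dc->nr_dentry_neg -= n;
	dc->nr_dentry -= n;
	return n;
}

/* Model of allocating one negative dentry with the check in front. */
static void model_alloc_negative(struct dcache_model *dc)
{
	assert(dc->nr_dentry_neg <= dc->nr_dentry);

	if (too_many_negative(dc->nr_dentry, dc->nr_dentry_neg))
		reclaim_negative_dentries(dc, NEG_RECLAIM_BATCH);

	dc->nr_dentry++;
	dc->nr_dentry_neg++;
}
```

With 600 positive dentries held constant, this model lets the negative
count climb to 400 (40% of 1000) and then oscillates between 385 and
400, so only the task creating negative dentries pays for the reclaim.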