[PATCH v4 0/6] fs/dcache: Limit # of negative dentries

From: Waiman Long
Date: Mon Sep 18 2017 - 14:21:53 EST

v3->v4:
- Remove the config option "neg_dentry_pc" to reduce system admin
- Allow referenced negative dentries to be recycled in the list
  instead of being killed in pruning.
- Enable auto-tuneup of negative dentry limit to match positive
  dentry count.
- Remove the umount racing patch but take an active reference on
  SB while pruning to prevent it from vanishing.
- Separate out the dentry_kill() code relocation in patch 1 to a
  separate patch.
- Move the negative dentry tracking patch in front of the limiting
- Decrease the default negative dentry percentage from 5% to 2%.

v2->v3:
- Add a faster pruning rate when the free pool is close to depletion.
- As suggested by James Bottomley, add an artificial delay waiting
  loop before killing a negative dentry and properly clear the
  DCACHE_KILL_NEGATIVE flag if killing doesn't happen.
- Add a new patch to track the number of negative dentries that are
  forcibly killed.

v1->v2:
- Move the new nr_negative field to the end of dentry_stat_t structure
  as suggested by Matthew Wilcox.
- With the help of Miklos Szeredi, fix incorrect locking order in
  dentry_kill() by using lock_parent() instead of locking the parent's
  d_lock directly.
- Correctly account for positive to negative dentry transitions.
- Automatic pruning of negative dentries will now ignore the reference
  bit in negative dentries, but regular shrinking will not.

A rogue application can potentially create a large number of negative
dentries in the system, consuming most of the available memory even if
the memory controller is enabled to limit memory usage. This can impact
the performance of other applications running on the system.

We have customers seeing soft lockups and unresponsive systems when
tearing down a container because of the large number of negative
dentries accumulated during its uptime that had to be cleaned up at
exit time when the container's filesystem was unmounted. So we need
to do something about it.

This patchset introduces changes to the dcache subsystem to limit
the number of negative dentries allowed to be created thus limiting
the amount of memory that can be consumed by negative dentries.

Patch 1 just relocates the position of the dentry_kill() function.

Patch 2 tracks the number of negative dentries present in the LRU
lists and reports it in /proc/sys/fs/dentry-state.

Patch 3 sets a limit on the number of negative dentries allowable as a
small percentage (2%) of total system memory. So the larger the system,
the more negative dentries can be allowed. Once the limit is reached,
new negative dentries will be killed after use.
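The sizing rule in patch 3 can be sketched with a little arithmetic. This is only a back-of-the-envelope model, not the patch's actual computation: the helper name `negative_dentry_limit` is hypothetical, and the 192-byte per-dentry footprint is an assumed figure rather than one taken from the series.

```python
def negative_dentry_limit(total_mem_bytes, percent=2, dentry_size=192):
    """Estimate how many negative dentries fit in `percent` of memory.

    dentry_size is an ASSUMED per-dentry footprint in bytes; the real
    limit in the patch may be derived differently.
    """
    budget_bytes = total_mem_bytes * percent // 100
    return budget_bytes // dentry_size


GIB = 1024 ** 3
# 50GB matches the test VM described in this cover letter.
limit = negative_dentry_limit(50 * GIB)
print(limit)
```

Under these assumptions the 50GB VM would allow somewhere between 1 million and 10 million negative dentries, which is in line with the test results reported below (no throttling at 1 million, throttling at 10 million).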

Patch 4 enables automatic pruning of the least recently used negative
dentries when their count is close to the limit so that we won't end up
killing recently used negative dentries.
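The interplay between the hard limit (patch 3) and LRU pruning (patch 4) can be illustrated with a toy userspace model. Everything here is hypothetical: the class `NegativeDentryCache`, the `watermark_pct` threshold and the explicit `prune()` call stand in for kernel mechanisms whose details are not described in this cover letter.

```python
from collections import OrderedDict


class NegativeDentryCache:
    """Toy LRU model of negative dentries; names and thresholds are illustrative."""

    def __init__(self, limit, watermark_pct=90):
        self.limit = limit
        self.watermark = limit * watermark_pct // 100
        self.lru = OrderedDict()  # least recently used entry first
        self.killed = 0           # new entries dropped because the limit was hit
        self.pruned = 0           # old entries reclaimed by pruning

    def lookup(self, name):
        """Record a failed lookup; return True if the entry stays cached."""
        if name in self.lru:
            self.lru.move_to_end(name)  # mark as recently used
            return True
        if len(self.lru) >= self.limit:
            self.killed += 1            # at the limit: kill after use
            return False
        self.lru[name] = None
        return True

    def needs_prune(self):
        return len(self.lru) > self.watermark

    def prune(self, batch):
        """Reclaim up to `batch` least recently used entries."""
        for _ in range(min(batch, len(self.lru))):
            self.lru.popitem(last=False)
            self.pruned += 1


cache = NegativeDentryCache(limit=4, watermark_pct=75)
results = [cache.lookup(n) for n in "abcde"]  # the fifth name hits the limit
cache.lookup("a")                             # "a" becomes most recently used
if cache.needs_prune():
    cache.prune(batch=2)                      # reclaims "b" and "c"
print(results, list(cache.lru), cache.killed)
```

In this model, pruning ahead of the limit removes the stale entries first, so the recently used entry "a" survives while a brand-new entry arriving at the limit is the one that gets killed.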

Patch 5 shows the number of forced negative dentry killings in

Patch 6 enables auto-tuneup of free pool negative dentry count to
no more than the maximum number of positive dentries ever used.

With a 4.13-based kernel, the positive and negative dentry lookup rates
(lookups per second) after initial boot on a 36-core 50GB memory VM
with and without the patch were as follows:

Metric                     w/o patch   with patch
------                     ---------   ----------
Positive dentry lookup        840269       845762
Negative dentry lookup       1903405      1962514
Negative dentry creation     6817957      6928768

The last row refers to the creation rate of 1 million negative
dentries. With 50GB of memory, 1 million negative dentries can be
created with the patched kernel without any pruning or dentry killing.

Ignoring some inherent noise in the test results, there wasn't any
noticeable difference in terms of lookup and negative dentry creation
performance with or without this patch.

By creating 10 million negative dentries, however, the performance

Metric                     w/o patch   with patch
------                     ---------   ----------
Negative dentry creation      651663       190105

For the patched kernel, the corresponding dentry-state was:

1608833 1590416 45 0 1579878 8286952

This was expected as negative dentry creation throttling with forced
dentry deletion happened in this case.
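The dentry-state line above can be decoded with a short script. The first four field names follow the kernel's existing dentry_stat_t layout; treating the fifth value as nr_negative and the sixth as a cumulative kill counter (called `nr_killed` here, a hypothetical name) is an inference from the magnitudes of the numbers, not something this cover letter states.

```python
from collections import namedtuple

# nr_dentry, nr_unused, age_limit and want_pages are existing
# dentry_stat_t fields. The last two names are ASSUMED labels for the
# counters added by this series.
DentryState = namedtuple(
    "DentryState",
    "nr_dentry nr_unused age_limit want_pages nr_negative nr_killed",
)


def parse_dentry_state(text):
    """Parse one line in the format of /proc/sys/fs/dentry-state."""
    fields = [int(tok) for tok in text.split()]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    return DentryState(*fields)


state = parse_dentry_state("1608833 1590416 45 0 1579878 8286952")
accounted = state.nr_negative + state.nr_killed
print(accounted)  # 9866830
```

If that reading of the fields is right, the two counters together account for roughly 9.87 million of the 10 million negative dentries created in the test.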

Running the AIM7 high-systime workload on the same VM, the baseline
performance was 186770 jobs/min. By running a single-thread rogue
negative dentry creation program in the background until the patched
kernel with the 2% limit started throttling, the performance was 183746
jobs/min. On an unpatched kernel with memory almost exhausted and
the memory shrinker kicked in, the performance was 148997 jobs/min.

So the patch does protect the system from suffering significant
performance degradation when a rogue negative dentry creation
program is running in the background.
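For reference, the relative AIM7 slowdowns implied by the figures above can be worked out as follows. The helper name `percent_drop` is just for illustration.

```python
def percent_drop(baseline, measured):
    """Return the throughput loss relative to baseline, in percent."""
    return (baseline - measured) / baseline * 100


BASELINE = 186770  # jobs/min, no rogue program running

patched = percent_drop(BASELINE, 183746)    # patched kernel, throttling
unpatched = percent_drop(BASELINE, 148997)  # unpatched, memory exhausted
print(f"patched: {patched:.1f}%  unpatched: {unpatched:.1f}%")
# patched: 1.6%  unpatched: 20.2%
```

So the rogue program costs about 1.6% of throughput on the patched kernel versus about 20% on the unpatched one.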

Waiman Long (6):
  fs/dcache: Relocate dentry_kill() after lock_parent()
  fs/dcache: Track & report number of negative dentries
  fs/dcache: Limit numbers of negative dentries
  fs/dcache: Enable automatic pruning of negative dentries
  fs/dcache: Track count of negative dentries forcibly killed
  fs/dcache: Autotuning of negative dentry limit

 fs/dcache.c              | 462 +++++++++++++++++++++++++++++++++++++++++++----
 include/linux/dcache.h   |   8 +-
 include/linux/list_lru.h |   1 +
 mm/list_lru.c            |   4 +-
 4 files changed, 439 insertions(+), 36 deletions(-)