Re: [PATCH] mm: terminate shrink_slab loop if signal is pending

From: Suren Baghdasaryan
Date: Wed Dec 06 2017 - 20:27:26 EST


>
> Some quantification of "quite time consuming" and "delay" would be
> interesting, please.
>

Unfortunately, that depends on the implementation of the shrinkers
registered in the system, including the ones from drivers. I've
captured traces showing delays of up to 100ms where a process with a
pending SIGKILL is in direct memory reclaim and signal handling is
delayed because of that. I realize that it's not the fault of
shrink_slab() that some shrinkers take a long time to shrink their
slabs (sometimes for justifiable reasons and sometimes because
of a bug which has to be fixed), but this can be a safeguard against
such cases.
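
To illustrate the safeguard, here is a minimal userspace sketch, not the
actual patch. The struct sim_shrinker type, the walk_shrinkers() function
and the kill_at parameter are all hypothetical names made up for this
model; in the kernel the check would presumably be something along the
lines of fatal_signal_pending(current) inside the shrinker walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a registered shrinker. */
struct sim_shrinker {
	unsigned long cost_us;	/* simulated time one invocation takes */
};

/*
 * Walk the shrinker list and return the simulated time spent in it.
 * A fatal signal is modelled as becoming pending just before the
 * shrinker at index kill_at would run. With check_signal set, the
 * walk stops as soon as the signal is seen.
 */
unsigned long walk_shrinkers(const struct sim_shrinker *list, size_t n,
			     size_t kill_at, bool check_signal)
{
	unsigned long elapsed_us = 0;

	assert(list != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		bool kill_pending = (i >= kill_at);

		/* The safeguard: we are about to die, stop shrinking. */
		if (check_signal && kill_pending)
			break;

		elapsed_us += list[i].cost_us;
	}

	return elapsed_us;
}
```

With four simulated shrinkers costing 40ms, 50ms, 4ms and 6ms and the
signal arriving after the first one, the unchecked walk spends another
60ms before the signal can be handled, while the checked walk returns
right away. The numbers are made up for the illustration and are not
taken from my traces.
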
A couple of the shrinkers I found most time consuming are listed below
(most of that 100ms delay comes from the first two):

https://patchwork.kernel.org/patch/10096641/
The patch fixes the dm-bufio shrinker, which under certain conditions
reclaims only one buffer per scan, making the shrinking process very
inefficient.

https://android.googlesource.com/kernel/msm/+/android-7.1.0_r0.2/drivers/gpu/msm/kgsl_pool.c#420
This example is from a driver where the shrinker returns 0 instead of
SHRINK_STOP when it's unable to reclaim any more. As a result, when
total_scan in do_shrink_slab() is large, this causes multiple
scan_objects() calls with no memory being reclaimed. A patch for this
one is under review by the owners.
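
The effect of returning 0 versus SHRINK_STOP can be seen in a simplified
userspace model of the batching loop. This is only a sketch: the real
do_shrink_slab() does more bookkeeping, and count_scan_calls() and the two
scan callbacks below are hypothetical names. SHRINK_STOP is defined the
same way as in include/linux/shrinker.h.

```c
#include <assert.h>

#define SHRINK_STOP (~0UL)	/* same value as in include/linux/shrinker.h */

typedef unsigned long (*scan_fn)(unsigned long nr_to_scan);

/* Exhausted shrinker that reports "freed nothing" as 0. */
unsigned long scan_returns_zero(unsigned long nr_to_scan)
{
	(void)nr_to_scan;
	return 0;
}

/* Exhausted shrinker that tells the caller to give up. */
unsigned long scan_returns_stop(unsigned long nr_to_scan)
{
	(void)nr_to_scan;
	return SHRINK_STOP;
}

/*
 * Simplified batching loop: keep calling the scan callback in
 * batch-sized chunks until total_scan is used up or the callback
 * returns SHRINK_STOP. Returns the number of scan calls made.
 */
unsigned long count_scan_calls(scan_fn scan, unsigned long total_scan,
			       unsigned long batch)
{
	unsigned long calls = 0;

	assert(batch > 0);

	while (total_scan >= batch) {
		unsigned long ret = scan(batch);

		calls++;
		if (ret == SHRINK_STOP)
			break;

		total_scan -= batch;
	}

	return calls;
}
```

With total_scan = 1024 and a batch of 128 (values picked for the
illustration), the callback returning 0 is invoked 8 times without
freeing anything, while the one returning SHRINK_STOP is invoked once.
The number of wasted calls grows linearly with total_scan.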

A shrinker that seems to be justifiably heavy is super_cache_scan()
in fs/super.c. I have traces where it takes up to 4ms to complete.