Re: [PATCH 09/28] mm: directed shrinker work deferral

From: Brian Foster
Date: Mon Nov 04 2019 - 10:25:34 EST


On Fri, Nov 01, 2019 at 10:45:59AM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
>
> Introduce a mechanism for ->count_objects() to indicate to the
> shrinker infrastructure that the reclaim context will not allow
> scanning work to be done and so the work it decides is necessary
> needs to be deferred.
>
> This simplifies the code by separating out the accounting of
> deferred work from the actual doing of the work, and allows better
> decisions to be made by the shrinker control logic on what action it
> can take.
>
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> ---

My understanding from the previous discussion(s) is that this is not
tied directly to the gfp mask because that is not the only intended use.
While it is currently a boolean tied to the entire shrinker call, the
longer-term objective is per-object granularity.

I find the argument reasonable enough, but if the above is true, why
move these checks from ->scan_objects() to ->count_objects() (in the
next patch) when per-object decisions will ultimately need to be made by
the former? That seems like unnecessary churn, and it is inconsistent
with the argument against just temporarily doing what Christoph
suggested in the previous version, particularly since IIRC the only use
in this series is for gfp mask purposes.

>  include/linux/shrinker.h | 7 +++++++
>  mm/vmscan.c              | 8 ++++++++
>  2 files changed, 15 insertions(+)
>
> diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
> index 0f80123650e2..3405c39ab92c 100644
> --- a/include/linux/shrinker.h
> +++ b/include/linux/shrinker.h
> @@ -31,6 +31,13 @@ struct shrink_control {
>
>  	/* current memcg being shrunk (for memcg aware shrinkers) */
>  	struct mem_cgroup *memcg;
> +
> +	/*
> +	 * set by ->count_objects if reclaim context prevents reclaim from
> +	 * occurring. This allows the shrinker to immediately defer all the
> +	 * work and not even attempt to scan the cache.
> +	 */
> +	bool defer_work;
>  };
>
> #define SHRINK_STOP (~0UL)
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index ee4eecc7e1c2..a215d71d9d4b 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -536,6 +536,13 @@ static unsigned long do_shrink_slab(struct shrink_control *shrinkctl,
>  	trace_mm_shrink_slab_start(shrinker, shrinkctl, nr,
>  				   freeable, delta, total_scan, priority);
> 
> +	/*
> +	 * If the shrinker can't run (e.g. due to gfp_mask constraints), then
> +	 * defer the work to a context that can scan the cache.
> +	 */
> +	if (shrinkctl->defer_work)
> +		goto done;
> +

I still find it unnecessarily fragile that this per-invocation field is
never reset, and I don't see any good reason not to reset it just
before the shrinker callback that can set it.
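
E.g. something like the following in do_shrink_slab(), right before
the ->count_objects() call (untested sketch on top of this patch):

	/*
	 * Reset the per-call deferral state so a value left over from a
	 * previous invocation can never leak into this one.
	 */
	shrinkctl->defer_work = false;

	freeable = shrinker->count_objects(shrinker, shrinkctl);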

Brian

>  	/*
>  	 * Normally, we should not scan less than batch_size objects in one
>  	 * pass to avoid too frequent shrinker calls, but if the slab has less
> @@ -570,6 +577,7 @@ static unsigned long do_shrink_slab(struct shrink_control *shrinkctl,
>  		cond_resched();
>  	}
> 
> +done:
>  	if (next_deferred >= scanned)
>  		next_deferred -= scanned;
>  	else
> --
> 2.24.0.rc0
>