Re: [PATCH v2 2/3] selftests: cgroup: refactor proactive reclaim code to reclaim_until()

From: Yosry Ahmed
Date: Thu Dec 01 2022 - 22:20:06 EST


On Wed, Nov 30, 2022 at 10:25 AM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
>
> On Wed, Nov 30, 2022 at 9:20 AM Roman Gushchin <roman.gushchin@xxxxxxxxx> wrote:
> >
> > On Tue, Nov 29, 2022 at 11:42:31AM -0800, Yosry Ahmed wrote:
> > > On Wed, Nov 23, 2022 at 7:16 PM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
> > > >
> > > > On Wed, Nov 23, 2022 at 5:03 PM Roman Gushchin <roman.gushchin@xxxxxxxxx> wrote:
> > > > >
> > > > > On Wed, Nov 23, 2022 at 09:21:31AM +0000, Yosry Ahmed wrote:
> > > > > > Refactor the code that drives writing to memory.reclaim (retrying, error
> > > > > > handling, etc) from test_memcg_reclaim() to a helper called
> > > > > > reclaim_until(), which proactively reclaims from a memcg until its
> > > > > > usage reaches a certain value.
> > > > > >
> > > > > > This will be used in a following patch in another test.
> > > > > >
> > > > > > Signed-off-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
> > > > > > ---
> > > > > > .../selftests/cgroup/test_memcontrol.c | 85 +++++++++++--------
> > > > > > 1 file changed, 49 insertions(+), 36 deletions(-)
> > > > > >
> > > > > > diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> > > > > > index 8833359556f3..d4182e94945e 100644
> > > > > > --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> > > > > > +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> > > > > > @@ -645,6 +645,53 @@ static int test_memcg_max(const char *root)
> > > > > > return ret;
> > > > >
> > > > >
> > > > > The code below looks correct, but can be simplified a bit.
> > > > > And btw thank you for adding a test!
> > > > >
> > > > > Reviewed-by: Roman Gushchin <roman.gushchin@xxxxxxxxx>
> > > > > > (idk if you want to invest your time in further simplification of this
> > > > > > code, it was this way before this patch, so up to you).
> > > >
> > > > I don't "want" to, but the voices in my head won't shut up until I do so..
> > > >
> > > > Here's a patch that simplifies the code, I inlined it here to avoid
> > > > sending a new version. If it looks good to you, it can be squashed
> > > > into this patch or merged separately (whatever you and Andrew prefer).
> > > > I can also send it in a separate thread if preferred.
> > >
> > > Roman, any thoughts on this?
> > >
> > > >
> > > >
> > > > From 18c40d61dac05b33cfc9233b17979b54422ed7c5 Mon Sep 17 00:00:00 2001
> > > > From: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
> > > > Date: Thu, 24 Nov 2022 02:21:12 +0000
> > > > Subject: [PATCH] selftests: cgroup: simplify memcg reclaim code
> > > >
> > > > Simplify the code for the reclaim_until() helper used for memcg reclaim
> > > > through memory.reclaim.
> > > >
> > > > Signed-off-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
> > > > ---
> > > > .../selftests/cgroup/test_memcontrol.c | 65 ++++++++++---------
> > > > 1 file changed, 33 insertions(+), 32 deletions(-)
> > > >
> > > > diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c
> > > > b/tools/testing/selftests/cgroup/test_memcontrol.c
> > > > index bac3b91f1579..2e2bde44a6f7 100644
> > > > --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> > > > +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> > > > @@ -17,6 +17,7 @@
> > > > #include <netdb.h>
> > > > #include <errno.h>
> > > > #include <sys/mman.h>
> > > > +#include <limits.h>
> > > >
> > > > #include "../kselftest.h"
> > > > #include "cgroup_util.h"
> > > > @@ -656,51 +657,51 @@ static int test_memcg_max(const char *root)
> > > > return ret;
> > > > }
> > > >
> > > > -/* Reclaim from @memcg until usage reaches @goal_usage */
> > > > +/*
> > > > + * Reclaim from @memcg until usage reaches @goal_usage by writing to
> > > > + * memory.reclaim.
> > > > + *
> > > > + * This function will return false if the usage is already below the
> > > > + * goal.
> > > > + *
> > > > + * This function assumes that writing to memory.reclaim is the only
> > > > + * source of change in memory.current (no concurrent allocations or
> > > > + * reclaim).
> > > > + *
> > > > + * This function makes sure memory.reclaim is sane. It will return
> > > > + * false if memory.reclaim's error codes do not make sense, even if
> > > > + * the usage goal was satisfied.
> > > > + */
> > > > static bool reclaim_until(const char *memcg, long goal_usage)
> > > > {
> > > > char buf[64];
> > > > int retries = 5;
> > > > - int err;
> > > > + int err = INT_MAX;
> > > > long current, to_reclaim;
> > > >
> > > > - /* Nothing to do here */
> > > > - if (cg_read_long(memcg, "memory.current") <= goal_usage)
> > > > - return true;
> > > > -
> >
> > Hi Yosry!
> >
> > Thank you for working on this!
> > I feel like it's still way more complex than it can be.
> > How about something like this? (completely untested, treat it
> > as pseudo-code).
>
> Thanks Roman!
>
> This looks much simpler, and it nicely and subtly catches the false
> negative case (where we return -EAGAIN but have actually reclaimed the
> requested amount), but I think it doesn't catch the false positive
> case (where memory.reclaim returns 0 but hasn't reclaimed enough
> memory). In this case I think we will just keep retrying and ignore
> the false positive?
>
> Maybe with the following added check?
>
> >
> >
> > {
> > 	...
> > 	bool ret = false;
> >
> > 	for (retries = 5; retries > 0; retries--) {
> > 		current = cg_read_long(memcg, "memory.current");
> >
> > 		if (current <= goal_usage) // replace with values_close?
> > 			break;
> 		else if (ret) { // false positive?
> 			ret = false;
> 			break;
> 		}
> >
> > 		to_reclaim = current - goal_usage;
> > 		snprintf(buf, sizeof(buf), "%ld", to_reclaim);
> > 		err = cg_write(memcg, "memory.reclaim", buf);
> > 		if (!err)
> > 			ret = true;
> > 		else if (err != -EAGAIN)
> > 			break;
> > 	}
> >
> > 	return ret;
> > }
>
> Also, please let me know if you prefer that I send this cleanup in the
> same thread like the above, in a completely separate patch that
> depends on this series, or have it squashed into this patch in a v3.
>
> Thanks again!

I realized I missed a few folks in the CC of this version anyway. Sent
v3 with the suggested refactoring (+ the missing check for false
positives) squashed into this patch. Also included your review tags on
patches 1 & 3 (patch 2 was almost rewritten according to your
suggestions, so I dropped the review tag and added a suggested tag):

https://lore.kernel.org/lkml/20221202031512.1365483-1-yosryahmed@xxxxxxxxxx/
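
For anyone following along without the selftests tree handy, the loop
discussed above (with the false-positive check folded in) can be sketched
roughly as below. This is only an illustration of the pseudo-code in this
thread, not the v3 patch itself. fake_cg_read_long(), fake_cg_write() and
the fake_behavior modes are hypothetical stand-ins for cg_read_long() and
cg_write() from cgroup_util.h, assumed here so the sketch compiles on its
own; they model a memcg whose usage only changes through memory.reclaim.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical behaviors for the simulated memory.reclaim file. */
enum fake_mode { FAKE_SANE, FAKE_FALSE_POSITIVE, FAKE_FALSE_NEGATIVE };

static long fake_usage;			/* simulated memory.current */
static enum fake_mode fake_behavior;	/* how the simulated write misbehaves */

/* Stand-in for cg_read_long(): always reports the simulated usage. */
static long fake_cg_read_long(const char *memcg, const char *control)
{
	(void)memcg;
	(void)control;
	return fake_usage;
}

/* Stand-in for cg_write(): "reclaims" according to fake_behavior. */
static int fake_cg_write(const char *memcg, const char *control,
			 const char *buf)
{
	long amount = strtol(buf, NULL, 10);

	(void)memcg;
	(void)control;
	assert(amount > 0);

	switch (fake_behavior) {
	case FAKE_FALSE_POSITIVE:
		/* Reports success but reclaims only half of the request. */
		fake_usage -= amount / 2;
		return 0;
	case FAKE_FALSE_NEGATIVE:
		/* Reclaims everything but still reports -EAGAIN. */
		fake_usage -= amount;
		return -EAGAIN;
	default:
		fake_usage -= amount;
		return 0;
	}
}

/*
 * Reclaim until usage reaches goal_usage. Returns true only if a write
 * reported success and the next read confirms the goal was reached.
 */
static bool reclaim_until(const char *memcg, long goal_usage)
{
	char buf[64];
	int retries, err;
	long current, to_reclaim;
	bool ret = false;

	for (retries = 5; retries > 0; retries--) {
		current = fake_cg_read_long(memcg, "memory.current");

		if (current <= goal_usage)
			break;
		else if (ret) {
			/* The last write returned 0 but did not reclaim enough. */
			ret = false;
			break;
		}

		to_reclaim = current - goal_usage;
		snprintf(buf, sizeof(buf), "%ld", to_reclaim);
		err = fake_cg_write(memcg, "memory.reclaim", buf);
		if (!err)
			ret = true;
		else if (err != -EAGAIN)
			break;
	}

	return ret;
}
```

With these stand-ins, a well-behaved write makes reclaim_until() return
true, while both the false-positive and false-negative modes make it return
false. It also returns false when usage is already at or below the goal,
matching the comment in the earlier patch.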

Thanks!