Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()

From: Michal Koutný
Date: Tue May 10 2022 - 13:43:49 EST


Hello all.

On Mon, May 09, 2022 at 05:44:24PM -0700, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> So I think we're OK with [2/5] now. Unless there be objections, I'll
> be looking to get this series into mm-stable later this week.

I'm sorry, I think the current form of the test reveals an unexpected
behavior of reclaim and silencing the test is not the way to go.
Although, I may be convinced that my understanding is wrong.


On Mon, May 09, 2022 at 11:09:15AM -0400, Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> My understanding of the issue you're raising, Michal, is that
> protected siblings start with current > low, then get reclaimed
> slightly too much and end up with current < low. This results in a
> tiny bit of float that then gets assigned to the low=0 sibling;

Up until here, we're on the same page.

> when that sibling gets reclaimed regardless, it sees a low event.
> Correct me if I missed a detail or nuance here.

Here, I'd like to stress that the event itself is just a messenger (whom
my original RFC patch attempted to get rid of). The problem is that if
the sibling with recursive protection is active enough to claim it, it's
effectively stolen from the passive sibling. See the comparison of
'precious' vs 'victim' in [1].

> But unused float going to siblings is intentional. This is documented
> in point 3 in the comment above effective_protection(): if you use
> less than you're legitimately claiming, the float goes to your
> siblings.

The problem is how the unused protection came to be (voluntarily not
consumed vs reclaimed).

> So the problem doesn't seem to be with low accounting and
> event generation, but rather it's simply overreclaim.

Exactly.

> It's conceivable to make reclaim more precise and then tighten up the
> test. But right now, David's patch looks correct to me.

The obvious fix is at the end of this message, it resolves the case I
posted earlier (with memory_recursiveprot), however, it "breaks"
memory.events:low accounting inside recursive children, hence I'm not
considering it finished. (I may elaborate on the breaking case if
interested, I also need to look more into that myself).


On Fri, May 06, 2022 at 09:40:15AM -0700, David Vernet <void@xxxxxxxxxxxxx> wrote:
> If you look at how much memory A/B/E gets at the end of the reclaim,
> it's still far less than 1MB (though should it be 0?).

This selftest has two ±equal workloads in siblings, however, if their
activity varies, it can end up even opposite (the example [1]).

> This definitely sounds to me like a useful testcase to add, and I'm
> happy to do so in a follow-on patch. If we added this, do you think
> we need to keep the check for memory.low events for the memory.low ==
> 0 child in the overcommit testcase?

I think it's still useful, to check the behavior when inherited vs
explicit siblings coexist under protected parent.
Actually, the second case of all siblings having the inherited
(implicit) protection is also interesting (it seems that's that I'm
seeing in my tests with the attached patch).

+Cc: Chris, who reasoned about the SWAP_CLUSTER_MAX rounding vs too high
priority (too low numerically IIUC) [2].

Michal

[1] https://lore.kernel.org/r/20220325103118.GC2828@xxxxxxxxxxxxxxxxx/
[2] https://lore.kernel.org/all/20190128214213.GB15349@xxxxxxxxxxxxxx/

--- 8< ---