Re: [PATCH] mm/memcontrol: stop resize loop if limit was changed again

From: Pavel Tikhomirov
Date: Wed Mar 20 2024 - 06:55:34 EST




On 20/03/2024 18:28, Michal Hocko wrote:
On Wed 20-03-24 18:03:30, Pavel Tikhomirov wrote:
In memory_max_write() we first set memcg->memory.max and only then
try to enforce it in loop. What if while we are in loop someone else
have changed memcg->memory.max but we are still trying to enforce
the old value? I believe this can lead to nasty consequence like getting
an oom on perfectly fine cgroup within it's limits or excess reclaim.

I would argue that uncoordinated hard limit configuration can cause
problems on their own.

Sorry, didn't know that.

Beside how is this any different from changing
the high limit while we are inside the reclaim loop?

I believe reclaim loop rereads limits on each iteration, e.g. in reclaim_high(), so it should always be enforcing the right limit.


We also have exactly the same thing in memory_high_write().

So let's stop enforcing old limits if we already have a new ones.

I do see any reasons why this would be harmful I just do not see why
this is a real thing or why the new behavior is any better for racing
updaters as those are not deterministic anyway. If you have any actual
usecase then more details would really help to justify this change.

The existing behavior makes some sense as it enforces the given limit
deterministically.

I don't have any actual problem, usecase or reproduce at hand, I only see a potential problem:

Let's imagine that:

a) We set cgroup max limit to some small value, memory_max_write updates memcg->memory.max and starts spinning in loop as it wants to reclaim some memory which does not fit in new limit.

b) We don't need small limit anymore and we raise the limit to a big value, but memory_max_write() from (a) is still spinning. And if we are lucky enough and processes of cgroup are constantly consuming memory, to compensate effect from memory_max_write() from (a), so that it will continue spinning there forever.

Yes it is not that bad, because memory_max/high_write() also constantly checks for pending signals in loop so they won't actually get irreversibly stuck. But I just thought it was worth fixing.


Signed-off-by: Pavel Tikhomirov <ptikhomirov@xxxxxxxxxxxxx>
---
mm/memcontrol.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 61932c9215e7..81b303728491 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6769,6 +6769,9 @@ static ssize_t memory_high_write(struct kernfs_open_file *of,
unsigned long nr_pages = page_counter_read(&memcg->memory);
unsigned long reclaimed;
+ if (memcg->memory.high != high)
+ break;
+
if (nr_pages <= high)
break;
@@ -6817,6 +6820,9 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
for (;;) {
unsigned long nr_pages = page_counter_read(&memcg->memory);
+ if (memcg->memory.max != max)
+ break;
+
if (nr_pages <= max)
break;
--
2.43.0


--
Best regards, Tikhomirov Pavel
Senior Software Developer, Virtuozzo.