Re: [PATCH v4] mm/memcg: try harder to decrease [memory,memsw].limit_in_bytes

From: Michal Hocko
Date: Mon Jan 15 2018 - 07:58:49 EST


On Mon 15-01-18 15:53:35, Andrey Ryabinin wrote:
>
>
> On 01/15/2018 03:46 PM, Michal Hocko wrote:
> > On Mon 15-01-18 15:30:59, Andrey Ryabinin wrote:
> >>
> >>
> >> On 01/12/2018 03:24 PM, Michal Hocko wrote:
> >>> On Fri 12-01-18 00:59:38, Andrey Ryabinin wrote:
> >>>> On 01/11/2018 07:29 PM, Michal Hocko wrote:
> >>> [...]
> >>>>> I do not think so. Consider that this reclaim races with other
> >>>>> reclaimers. Now you are reclaiming a large chunk so you might end up
> >>>>> reclaiming more than necessary. SWAP_CLUSTER_MAX would reduce the over
> >>>>> reclaim to be negligible.
> >>>>>
> >>>>
> >>>> I did consider this. And I think I already explained that sort of race in a previous email.
> >>>> Whether "Task B" is really a task in cgroup or it's actually a bunch of reclaimers,
> >>>> doesn't matter. That doesn't change anything.
> >>>
> >>> I would _really_ prefer two patches here. The first one removing the
> >>> hard coded reclaim count. That thing is just dubious at best. If you
> >>> _really_ think that the higher reclaim target is meaningful then make
> >>> it a separate patch. I am not convinced but I will not nack it either.
> >>> But it will make our life much easier if my over reclaim concern is
> >>> right and we will need to revert it. Conceptually those two changes are
> >>> independent anyway.
> >>>
> >>
> >> Ok, fair point. But what about livelock then? Don't you think that we should
> >> go back to something like in V1 patch to prevent it?
> >
> > I am not sure what you mean by the livelock here.
> >
>
> Livelock is when tasks in the cgroup constantly allocate reclaimable memory at a high rate,
> and the user asked to set an unreachably low limit, e.g. 'echo 4096 > memory.limit_in_bytes'.

OK, I wasn't sure. The reclaim target, however, doesn't have a direct
influence on this.

> We will loop indefinitely in mem_cgroup_resize_limit(), because try_to_free_mem_cgroup_pages() != 0
> (as long as cgroup tasks generate new reclaimable pages fast enough).

I do not think this is a real problem. The context is interruptible and
I would even consider it safer to keep retrying than simply failing
prematurely. My experience tells me that basically any hard coded retry
loop in the kernel is wrong.
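
To illustrate the shape of such a loop, here is a minimal user-space
sketch, not the actual mem_cgroup_resize_limit() code. Everything
prefixed with toy_ (and the poll_budget helper) is hypothetical and
exists only for this example: toy_reclaim() stands in for
try_to_free_mem_cgroup_pages(), and the interrupted() callback stands in
for signal_pending(current). The loop has no retry counter; it stops
only when the limit is reached, when nothing more can be reclaimed, or
when the caller is interrupted.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space model of a memory cgroup (not struct mem_cgroup). */
struct toy_memcg {
	unsigned long usage;        /* pages currently charged */
	unsigned long alloc_rate;   /* pages the workload charges after each reclaim pass */
	unsigned long reclaim_rate; /* pages a single reclaim pass can free */
};

/* Hypothetical stand-in for signal_pending(current). */
typedef bool (*interrupted_fn)(void *ctx);

/* Example interruption source: reports "interrupted" once the budget is used up. */
struct poll_budget {
	unsigned long polls_left;
};

static bool interrupted_after(void *ctx)
{
	struct poll_budget *b = ctx;

	if (b->polls_left == 0)
		return true;
	b->polls_left--;
	return false;
}

/* One reclaim pass; returns the number of pages freed (0 means no progress). */
static unsigned long toy_reclaim(struct toy_memcg *cg)
{
	unsigned long freed = cg->reclaim_rate < cg->usage ?
			      cg->reclaim_rate : cg->usage;

	cg->usage -= freed;
	return freed;
}

/*
 * Retry until usage fits under the new limit. There is deliberately no
 * hard coded retry count: the only exits are success, lack of reclaim
 * progress (-EBUSY), or interruption by the caller (-EINTR).
 */
static int toy_resize_limit(struct toy_memcg *cg, unsigned long limit,
			    interrupted_fn interrupted, void *ctx)
{
	for (;;) {
		if (interrupted(ctx))
			return -EINTR;
		if (cg->usage <= limit)
			return 0;
		if (!toy_reclaim(cg))
			return -EBUSY;
		/* The workload keeps charging new pages in the meantime. */
		cg->usage += cg->alloc_rate;
	}
}
```

In this model, a workload that allocates as fast as reclaim frees keeps
the loop spinning, but the caller can always break out of it, which is
the property that makes retrying acceptable here.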

--
Michal Hocko
SUSE Labs