Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks

From: Michal Hocko
Date: Mon Sep 13 2021 - 04:43:17 EST


On Mon 13-09-21 11:29:37, Vasily Averin wrote:
> On 9/10/21 5:55 PM, Michal Hocko wrote:
> > On Fri 10-09-21 16:20:58, Vasily Averin wrote:
> >> On 9/10/21 4:04 PM, Tetsuo Handa wrote:
> >>> Can't we add fatal_signal_pending(current) test to vmalloc() loop?
> >
> > We can and we should.
> >
> >> 1) this has been done in the past but has been reverted later.
> >
> > The reason for that should be addressed IIRC.
>
> I don't know the details of this, and I need some time to investigate it.

b8c8a338f75e ("Revert "vmalloc: back off when the current task is killed"")
should give good insight and references.

> >> 2) any vmalloc changes will affect non-memcg allocations too.
> >> If we're doing memcg-related checks it's better to do it in one place.
> >
> > I think those two things are just orthogonal. Bailing out from vmalloc
> > early sounds reasonable to me on its own. Allocating a large thing that
> > is likely to go away with the allocating context is just a waste of
> > resources and a potential source of disruption to others.
>
> I doubt that fatal signal should block any vmalloc allocations.
> I assume there are situations where rollback of some cancelled operation uses vmalloc.
> Or coredump saving on some remote storage can use vmalloc.

If there really are any such requirements then they really should be
documented.

> However for me it's abnormal that even OOM-killer cannot cancel huge vmalloc allocation.
> So I think tsk_is_oom_victim(current) check should be added to vm_area_alloc_pages()
> to break vmalloc cycle.

Why should an OOM-killed task behave any differently from any other task
killed without a way to handle the signal?
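
To make the point concrete, here is a rough userspace model of a
page-by-page allocation loop that backs off once a fatal signal is
pending. This is only a sketch, not kernel code: model_alloc_pages(),
model_free_pages(), model_fatal_signal_pending() and the kill_after
parameter are all hypothetical names standing in for the vmalloc page
loop and fatal_signal_pending(current). The loop does not care who sent
the kill, which is why a separate OOM-victim check looks redundant to me.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's fatal_signal_pending(current). */
static bool model_fatal_signal;

static bool model_fatal_signal_pending(void)
{
	return model_fatal_signal;
}

/*
 * Model of a page-by-page allocation loop.
 *
 * kill_after: simulate a fatal signal arriving once that many pages
 * have been allocated (negative means the signal never arrives).
 *
 * Returns the number of pages actually allocated. Anything less than
 * nr_pages means the caller has to free what it got and fail.
 */
static size_t model_alloc_pages(void **pages, size_t nr_pages, long kill_after)
{
	size_t i;

	assert(pages != NULL);

	for (i = 0; i < nr_pages; i++) {
		if (kill_after >= 0 && i == (size_t)kill_after)
			model_fatal_signal = true;

		/*
		 * Back off: the memory would go away together with the
		 * dying task, so allocating more of it is wasted work.
		 * The source of the kill (OOM killer or anything else)
		 * makes no difference here.
		 */
		if (model_fatal_signal_pending())
			break;

		pages[i] = malloc(4096);
		if (!pages[i])
			break;
	}

	return i;
}

/* Release the first nr entries filled in by model_alloc_pages(). */
static void model_free_pages(void **pages, size_t nr)
{
	while (nr--)
		free(pages[nr]);
}
```

With no signal the loop fills the whole array; with kill_after set to 3
it stops after three pages and the caller unwinds via model_free_pages().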

--
Michal Hocko
SUSE Labs