Re: [v4 1/4] mm, oom: refactor the TIF_MEMDIE usage

From: Roman Gushchin
Date: Wed Jul 26 2017 - 10:06:48 EST


On Wed, Jul 26, 2017 at 03:56:22PM +0200, Michal Hocko wrote:
> On Wed 26-07-17 14:27:15, Roman Gushchin wrote:
> [...]
> > @@ -656,13 +658,24 @@ static void mark_oom_victim(struct task_struct *tsk)
> > struct mm_struct *mm = tsk->mm;
> >
> > WARN_ON(oom_killer_disabled);
> > - /* OOM killer might race with memcg OOM */
> > - if (test_and_set_tsk_thread_flag(tsk, TIF_MEMDIE))
> > +
> > + if (!cmpxchg(&tif_memdie_owner, NULL, current)) {
> > + struct task_struct *t;
> > +
> > + rcu_read_lock();
> > + for_each_thread(current, t)
> > + set_tsk_thread_flag(t, TIF_MEMDIE);
> > + rcu_read_unlock();
> > + }
>
> I would really much rather see we limit the amount of memory reserves oom
> victims can consume rather than build on top of the current hackish
> approach of limiting the number of tasks because the fundamental problem
> is still there (a heavy multithreaded process can still deplete the
> reserves completely).
>
> Is there really any reason to not go with the existing patch I've
> pointed to the last time around? You didn't seem to have any objections
> back then.

Hi Michal!

I had this patch in mind and mentioned in the commit log that TIF_MEMDIE,
as a memory reserves access indicator, will probably be eliminated later.

But that patch is not upstream yet, and it's directly related to the theme.
The proposed refactoring of TIF_MEMDIE usage does not conflict with your
approach, and will not make it harder to go further in that direction.
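
To illustrate the single-owner idea from the hunk quoted above, here is a
minimal userspace sketch. It is not the kernel code: C11 atomics stand in
for cmpxchg(), and struct task, struct group, claim_memdie() and
release_memdie() are hypothetical names made up for this example. It
assumes the thread group (rather than a single task) is the owner and
ignores RCU and task lifetime entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_THREADS 8

/* Hypothetical stand-in for task_struct; memdie models TIF_MEMDIE. */
struct task {
	bool memdie;
};

/* Hypothetical stand-in for a thread group. */
struct group {
	struct task threads[MAX_THREADS];
	size_t nr_threads;
};

/* Models tif_memdie_owner; zero-initialized, so NULL means "no owner". */
static _Atomic(struct group *) memdie_owner;

/* Try to become the single owner; only the winner flags its threads. */
static bool claim_memdie(struct group *g)
{
	struct group *expected = NULL;

	/* Plays the role of !cmpxchg(&tif_memdie_owner, NULL, current). */
	if (!atomic_compare_exchange_strong(&memdie_owner, &expected, g))
		return false;	/* somebody else already owns the slot */

	for (size_t i = 0; i < g->nr_threads; i++)
		g->threads[i].memdie = true;
	return true;
}

/* Clear the flag on every thread, then give up ownership. */
static void release_memdie(struct group *g)
{
	/* Releasing without being the owner would be a bug in the caller. */
	assert(atomic_load(&memdie_owner) == g);

	for (size_t i = 0; i < g->nr_threads; i++)
		g->threads[i].memdie = false;
	atomic_store(&memdie_owner, NULL);
}
```

In this sketch a second caller of claim_memdie() simply fails until the
first owner calls release_memdie(), so at most one group has the flag set
at any time, however many threads that group contains.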

I'm slightly concerned about the idea of giving TIF_MEMDIE to all tasks
when we're killing a really large cgroup. But it's only a theoretical
concern; maybe it's fine.

So, I'd keep the existing approach for this patchset; afterwards we can
follow your approach, and we will have a better test case for it.

Thanks!