Re: [BUGFIX][PATCH] memcg: fix oom kill behavior v2

From: Daisuke Nishimura
Date: Tue Mar 02 2010 - 01:20:47 EST

Next message: Tony Lindgren: "Re: linux-next: manual merge of the omap tree with the tree"
Previous message: David Miller: "Re: [PATCH] perf_events: add sampling period randomization support"
In reply to: KAMEZAWA Hiroyuki: "Re: [BUGFIX][PATCH] memcg: fix oom kill behavior v2"
Next in thread: Daisuke Nishimura: "Re: [BUGFIX][PATCH] memcg: fix oom kill behavior v2"

On Tue, 2 Mar 2010 14:56:44 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> On Tue, 2 Mar 2010 14:37:38 +0900
> Daisuke Nishimura <nishimura@xxxxxxxxxxxxxxxxx> wrote:
>
> > On Tue, 2 Mar 2010 13:55:24 +0900, KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> > > Very sorry, mutex_lock was called after prepare_to_wait in the previous version.
> > > Here is the fixed one.
> > I'm willing to test your patch, but I have one concern.
> >
> > > +/*
> > > + * try to call OOM killer. returns false if we should exit memory-reclaim loop.
> > > + */
> > > +bool mem_cgroup_handle_oom(struct mem_cgroup *mem, gfp_t mask)
> > > {
> > > -	mem_cgroup_walk_tree(mem, NULL, record_last_oom_cb);
> > > +	DEFINE_WAIT(wait);
> > > +	bool locked;
> > > +
> > > +	/* At first, try to OOM lock hierarchy under mem.*/
> > > +	mutex_lock(&memcg_oom_mutex);
> > > +	locked = mem_cgroup_oom_lock(mem);
> > > +	if (!locked)
> > > +		prepare_to_wait(&memcg_oom_waitq, &wait, TASK_INTERRUPTIBLE);
> > > +	mutex_unlock(&memcg_oom_mutex);
> > > +
> > > +	if (locked)
> > > +		mem_cgroup_out_of_memory(mem, mask);
> > > +	else {
> > > +		schedule();
> > > +		finish_wait(&memcg_oom_waitq, &wait);
> > > +	}
> > > +	mutex_lock(&memcg_oom_mutex);
> > > +	mem_cgroup_oom_unlock(mem);
> > > +	/* TODO: more fine grained waitq ? */
> > > +	wake_up_all(&memcg_oom_waitq);
> > > +	mutex_unlock(&memcg_oom_mutex);
> > > +
> > > +	if (test_thread_flag(TIF_MEMDIE) || fatal_signal_pending(current))
> > > +		return false;
> > > +	/* Give chance to dying process */
> > > +	schedule_timeout(1);
> > > +	return true;
> > > }
> > >
> > Isn't there a race condition like this?
> >
> >   context A                               context B
> >   mutex_lock(&memcg_oom_mutex)
> >   mem_cgroup_oom_lock()
> >     ->success
> >   mutex_unlock(&memcg_oom_mutex)
> >   mem_cgroup_out_of_memory()
> >                                           mutex_lock(&memcg_oom_mutex)
> >                                           mem_cgroup_oom_lock()
> >                                             ->fail
> >                                           prepare_to_wait()
> >                                           mutex_unlock(&memcg_oom_mutex)
> >   mutex_lock(&memcg_oom_mutex)
> >   mem_cgroup_oom_unlock()
> >   wake_up_all()
> >   mutex_unlock(&memcg_oom_mutex)
> >                                           schedule()
> >                                           finish_wait()
> >
> > In this case, context B will not be woken up, right?
> >
>
> No. The
> prepare_to_wait();
> schedule();
> finish_wait();
> call sequence is meant for exactly this kind of waiting.
>
>
> 1. Thread B calls prepare_to_wait(); the wait entry is queued and the task's
>    state is changed to TASK_INTERRUPTIBLE.
> 2. Thread A calls wake_up_all(), which checks all waiters in the queue and
>    changes their state to TASK_RUNNING.
> 3. Thread B calls schedule(), but its state is already TASK_RUNNING, so it
>    will be scheduled again soon without sleeping.
>
Ah, you're right. I forgot point 2.
Thank you for your clarification.
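
To make the three steps above concrete, here is a toy user-space model in C. It is only a sketch, not kernel code: struct task, struct wait_queue and the toy_*() helpers are hypothetical stand-ins that track nothing but the task state, and the real prepare_to_wait(), wake_up_all(), schedule() and finish_wait() take different arguments and do far more. The model replays the interleaving from the diagram and, for contrast, an ordering where the wake-up arrives before the task has queued itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel task states; not the real kernel definitions. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

struct task {
	enum task_state state;
	bool slept;	/* set when the modelled schedule() would really block */
};

#define MAX_WAITERS 8

struct wait_queue {
	struct task *waiters[MAX_WAITERS];
	size_t count;
};

/* Queue the task and mark it as sleeping *before* it gives up the CPU. */
static void toy_prepare_to_wait(struct wait_queue *q, struct task *t)
{
	assert(q->count < MAX_WAITERS);
	q->waiters[q->count++] = t;
	t->state = TASK_INTERRUPTIBLE;
}

/* Flip every queued task back to TASK_RUNNING. */
static void toy_wake_up_all(struct wait_queue *q)
{
	for (size_t i = 0; i < q->count; i++)
		q->waiters[i]->state = TASK_RUNNING;
}

/* The task only goes to sleep if its state is not TASK_RUNNING. */
static void toy_schedule(struct task *t)
{
	if (t->state != TASK_RUNNING)
		t->slept = true;
}

/* Restore TASK_RUNNING and drop the task from the queue if still there. */
static void toy_finish_wait(struct wait_queue *q, struct task *t)
{
	t->state = TASK_RUNNING;
	for (size_t i = 0; i < q->count; i++) {
		if (q->waiters[i] == t) {
			q->waiters[i] = q->waiters[--q->count];
			break;
		}
	}
}

/* Interleaving from the diagram: B queues itself, A wakes, B schedules. */
static bool replay_diagram_interleaving(void)
{
	struct wait_queue q = { .count = 0 };
	struct task b = { .state = TASK_RUNNING, .slept = false };

	toy_prepare_to_wait(&q, &b);	/* context B: OOM lock attempt failed */
	toy_wake_up_all(&q);		/* context A: finished, wakes waiters */
	toy_schedule(&b);		/* context B: already RUNNING, no sleep */
	toy_finish_wait(&q, &b);
	return b.slept;
}

/* Contrast case: the wake-up happens before B is on the queue. */
static bool replay_wake_before_prepare(void)
{
	struct wait_queue q = { .count = 0 };
	struct task b = { .state = TASK_RUNNING, .slept = false };

	toy_wake_up_all(&q);		/* nobody queued yet, nothing to wake */
	toy_prepare_to_wait(&q, &b);
	toy_schedule(&b);		/* state is INTERRUPTIBLE, would sleep */
	toy_finish_wait(&q, &b);
	return b.slept;
}
```

In this model replay_diagram_interleaving() returns false (context B never sleeps), while replay_wake_before_prepare() returns true. In the quoted patch the second ordering looks unreachable, because both the failed mem_cgroup_oom_lock() attempt and prepare_to_wait() happen under memcg_oom_mutex, and so do mem_cgroup_oom_unlock() and wake_up_all().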

I'll test this patch overnight and check that it doesn't trigger a
global OOM after a memcg OOM.

Thanks,
Daisuke Nishimura.

> So calling mutex_lock() after prepare_to_wait() is bad ;)
>
> Thanks,
> -Kame
