Re: [RFC] oom-kill: give the dying task a higher priority

From: Minchan Kim
Date: Wed Jun 02 2010 - 20:52:58 EST


On Thu, Jun 3, 2010 at 8:36 AM, KOSAKI Motohiro
<kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
>> On Wed, 2 Jun 2010, KOSAKI Motohiro wrote:
>>
>> > > > @@ -291,9 +309,10 @@ static struct task_struct *select_bad_process(unsigned long *ppoints,
>> > > > Â Â Â Â Â Â Â Â Â* Otherwise we could get an easy OOM deadlock.
>> > > > Â Â Â Â Â Â Â Â Â*/
>> > > > Â Â Â Â Â Â Â Â if (p->flags & PF_EXITING) {
>> > > > - Â Â Â Â Â Â Â Â Â Â Â if (p != current)
>> > > > + Â Â Â Â Â Â Â Â Â Â Â if (p != current) {
>> > > > + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â boost_dying_task_prio(p, mem);
>> > > > Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â return ERR_PTR(-1UL);
>> > > > -
>> > > > + Â Â Â Â Â Â Â Â Â Â Â }
>> > > > Â Â Â Â Â Â Â Â Â Â Â Â chosen = p;
>> > > > Â Â Â Â Â Â Â Â Â Â Â Â *ppoints = ULONG_MAX;
>> > > > Â Â Â Â Â Â Â Â }
>> > >
>> > > This has the potential to actually make it harder to free memory if p is
>> > > waiting to acquire a writelock on mm->mmap_sem in the exit path while the
>> > > thread holding mm->mmap_sem is trying to run.
>> >
>> > if p is waiting, changing prio have no effect. It continue tol wait to release mmap_sem.
>> >
>>
>> And that can reduce the runtime of the thread holding a writelock on
>> mm->mmap_sem, making the exit actually take longer than without the patch
>> if its priority is significantly higher, especially on smaller machines.
>
> If p need mmap_sem, p is going to sleep to wait mmap_sem. if p doesn't,
> quickly exit is good thing. In other word, task fairness is not our goal
> when oom occur.
>

Tend to agree. I didn't agree boosting of whole threads' priority.

Task fairness VS system hang is trade off. task fairness is best
effort but system hang is critical.
Also, we have tried to it.

/*
* We give our sacrificial lamb high priority and access to
* all the memory it needs. That way it should be able to
* exit() and clear out its resources quickly...
*/
p->rt.time_slice = HZ;
set_tsk_thread_flag(p, TIF_MEMDIE);

But I think above code is meaningless unless p use SCHED_RR.
So boosting of lowest RT priority with FIFO is to meet above comment's
goal, I think.

--
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/