Re: [RFC] oom-kill: give the dying task a higher priority

From: Minchan Kim
Date: Fri May 28 2010 - 10:06:36 EST


On Fri, May 28, 2010 at 09:53:05AM -0300, Luis Claudio R. Goncalves wrote:
> On Fri, May 28, 2010 at 02:59:02PM +0900, KOSAKI Motohiro wrote:
> | > RT Task
> | >
> | > void non-RT-function()
> | > {
> | > system call();
> | > buffer = malloc();
> | > memset(buffer);
> | > }
> | > /*
> | > * We make sure this function must be executed in some millisecond
> | > */
> | > void RT-function()
> | > {
> | > some calculation(); <- This doesn't have no dynamic characteristic
> | > }
> | > int main()
> | > {
> | > non-RT-function();
> | > /* This function make sure RT-function cannot preempt by others */
> | > set_RT_max_high_priority();
> | > RT-function A();
> | > set_normal_priority();
> | > non-RT-function();
> | > }
> | >
> | > We don't want realtime in whole function of the task. What we want is
> | > just RT-function A.
> | > Of course, current Linux cannot make perfectly sure RT-functionA can
> | > not preempt by others.
> | > That's because some interrupt or exception happen. But RT-function A
> | > doesn't related to any dynamic characteristic. What can justify to
> | > preempt RT-function A by other processes?
> |
> | As far as my observation, RT-function always have some syscall. because pure
> | calculation doesn't need deterministic guarantee. But _if_ you are really
> | using such priority design. I'm ok maximum NonRT priority instead maximum
> | RT priority too.
>
> I confess I failed to distinguish memcg OOM and system OOM and used "in
> case of OOM kill the selected task the faster you can" as the guideline.
> If the exit code path is short that shouldn't be a problem.
>
> Maybe the right way to go would be giving the dying task the biggest
> priority inside that memcg to be sure that it will be the next process from
> that memcg to be scheduled. Would that be reasonable?

Hmm. I don't understand your point.
What do you mean by failing to distinguish memcg OOM from system OOM?

We already distinguish them via mem_cgroup_out_of_memory()
(although CONFIG_CGROUP_MEM_RES_CTLR has to be enabled for that).
So when a memcg is under memory pressure, the task chosen by
select_bad_process() is one of that memcg's own tasks.

Isn't that enough?
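
To make the point concrete, here is a minimal userspace sketch of the idea,
not the kernel code. The struct, the ANY_MEMCG constant and pick_victim()
are all hypothetical names made up for illustration, and the "badness"
field is only a stand-in for whatever score the OOM killer computes. It
shows victim selection being limited to one memcg when a memcg is passed
in, and considering every task otherwise:

```c
#include <stddef.h>

/* Hypothetical userspace model; none of these types are the kernel's. */
struct task {
	const char *name;
	int memcg_id;          /* memory cgroup the task belongs to */
	unsigned long badness; /* stand-in for the OOM badness score */
};

#define ANY_MEMCG (-1) /* system-wide OOM: consider every task */

/*
 * Return the task with the highest badness. If memcg_id is not
 * ANY_MEMCG, only tasks in that memcg are candidates.
 * Returns NULL when no task qualifies.
 */
static const struct task *pick_victim(const struct task *tasks, size_t n,
				      int memcg_id)
{
	const struct task *victim = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		/* memcg OOM: skip tasks outside the cgroup under pressure */
		if (memcg_id != ANY_MEMCG && tasks[i].memcg_id != memcg_id)
			continue;
		if (!victim || tasks[i].badness > victim->badness)
			victim = &tasks[i];
	}
	return victim;
}
```

With this model, a memcg OOM calls pick_victim(tasks, n, id) and can never
pick a task outside that cgroup, while a system OOM passes ANY_MEMCG.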
--
Kind regards,
Minchan Kim