Re: [PATCH] mm, oom: Fix race when selecting process to kill
From: Luigi Semenzato
Date: Tue Nov 05 2013 - 20:25:22 EST
It's not enough to hold a reference to the task struct, because it can
still be taken out of the circular list of threads. The RCU
assumptions don't hold in that case.
On Tue, Nov 5, 2013 at 5:18 PM, David Rientjes <rientjes@xxxxxxxxxx> wrote:
> On Tue, 5 Nov 2013, Sameer Nanda wrote:
>
>> The selection of the process to be killed happens in two spots -- first
>> in select_bad_process and then a further refinement by looking for
>> child processes in oom_kill_process. Since this is a two step process,
>> it is possible that the process selected by select_bad_process may get a
>> SIGKILL just before oom_kill_process executes. If this were to happen,
>> __unhash_process deletes this process from the thread_group list. This
>> then results in oom_kill_process getting stuck in an infinite loop when
>> traversing the thread_group list of the selected process.
>>
>> Fix this race by holding the tasklist_lock across the calls to both
>> select_bad_process and oom_kill_process.
>>
>> Change-Id: I8f96b106b3257b5c103d6497bac7f04f4dff4e60
>> Signed-off-by: Sameer Nanda <snanda@xxxxxxxxxxxx>
>
> Nack, we had to avoid taking tasklist_lock for this duration since it
> stalls out forks and exits on other cpus trying to take the writeside with
> irqs disabled to avoid watchdog problems.
>
> What kernel version are you patching? If you check the latest Linus tree,
> we hold a reference to the task_struct of the chosen process before
> calling oom_kill_process() so the hypothesis would seem incorrect.
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@xxxxxxxxxx For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/