Re: [PATCH] mm: oom: Fix race condition between oom_badness and do_exit of task

From: Kohli, Gaurav
Date: Wed Mar 07 2018 - 23:51:40 EST


On 3/8/2018 2:26 AM, David Rientjes wrote:
> On Wed, 7 Mar 2018, Gaurav Kohli wrote:
>
>> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>> index 6fd9773..5f4cc4b 100644
>> --- a/mm/oom_kill.c
>> +++ b/mm/oom_kill.c
>> @@ -114,9 +114,11 @@ struct task_struct *find_lock_task_mm(struct task_struct *p)
>>      for_each_thread(p, t) {
>>          task_lock(t);
>> +        get_task_struct(t);
>>          if (likely(t->mm))
>>              goto found;
>>          task_unlock(t);
>> +        put_task_struct(t);
>>      }
>>      t = NULL;
>>  found:
>
> We hold rcu_read_lock() here, so perhaps only do get_task_struct() before
> doing rcu_read_unlock() and we have a non-NULL t?

rcu_read_lock() will not help here, because the task can change as the loop below walks the thread list:

     for_each_thread(p, t) {
         task_lock(t);
+        get_task_struct(t);
         if (likely(t->mm))
             goto found;
         task_unlock(t);
+        put_task_struct(t);


So we can only increment the usage counter here, on the task currently being examined in the loop.
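
For illustration only, the reference accounting under discussion can be modeled in plain userspace C. This is a minimal sketch, not kernel code: struct fake_task and the fake_* helpers are hypothetical stand-ins for task_struct, get_task_struct()/put_task_struct() and task_lock()/task_unlock(), and it models neither RCU nor a concurrently exiting task. It only compares where the reference is taken: inside the loop, as in the diff, or once on the thread that is returned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for task_struct; not the kernel type. */
struct fake_task {
    int usage;   /* models the task usage counter */
    int has_mm;  /* nonzero models t->mm != NULL */
    int locked;  /* models the task_lock() state */
};

static void fake_get(struct fake_task *t)    { t->usage++; }
static void fake_put(struct fake_task *t)    { t->usage--; }
static void fake_lock(struct fake_task *t)   { assert(!t->locked); t->locked = 1; }
static void fake_unlock(struct fake_task *t) { assert(t->locked); t->locked = 0; }

/*
 * Variant A: take a reference on every visited thread and drop it
 * again when the thread is skipped (mirrors the shape of the diff).
 */
static struct fake_task *find_ref_in_loop(struct fake_task *threads, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        struct fake_task *t = &threads[i];

        fake_lock(t);
        fake_get(t);
        if (t->has_mm)
            return t;        /* returned locked, holding one reference */
        fake_unlock(t);
        fake_put(t);
    }
    return NULL;
}

/*
 * Variant B: take a single reference, only on the thread that is
 * returned. In the kernel this would sit before rcu_read_unlock().
 */
static struct fake_task *find_ref_on_found(struct fake_task *threads, size_t n)
{
    struct fake_task *found = NULL;

    for (size_t i = 0; i < n; i++) {
        struct fake_task *t = &threads[i];

        fake_lock(t);
        if (t->has_mm) {
            found = t;       /* stays locked */
            break;
        }
        fake_unlock(t);
    }
    if (found)
        fake_get(found);
    return found;
}
```

In this model both variants return the same thread, locked and holding exactly one extra reference, and leave skipped threads unchanged. They differ only in whether a skipped thread briefly holds an extra reference while it is locked.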

I have seen your new patch; it looks valid to me and will resolve our issue.
Thanks for the support.

Regards

Gaurav


--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.