Re: [RFC PATCH 1/2] mm, oom: marks all killed tasks as oom victims

From: Tetsuo Handa
Date: Mon Oct 22 2018 - 06:57:12 EST


On 2018/10/22 19:43, Michal Hocko wrote:
> On Mon 22-10-18 18:42:30, Tetsuo Handa wrote:
>> On 2018/10/22 17:48, Michal Hocko wrote:
>>> On Mon 22-10-18 16:58:50, Tetsuo Handa wrote:
>>>> Michal Hocko wrote:
>>>>> --- a/mm/oom_kill.c
>>>>> +++ b/mm/oom_kill.c
>>>>> @@ -898,6 +898,7 @@ static void __oom_kill_process(struct task_struct *victim)
>>>>>  		if (unlikely(p->flags & PF_KTHREAD))
>>>>>  			continue;
>>>>>  		do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, PIDTYPE_TGID);
>>>>> +		mark_oom_victim(p);
>>>>>  	}
>>>>>  	rcu_read_unlock();
>>>>>
>>>>> --
>>>>
>>>> Wrong. Either
>>>
>>> You are right. The mm might go away between process_shares_mm and here.
>>> While your find_lock_task_mm would be correct I believe we can do better
>>> by using the existing mm that we already have. I will make it a separate
>>> patch for clarity.
>>
>> Still wrong. p->mm == NULL means that we are too late to set TIF_MEMDIE
>> on that thread. Passing non-NULL mm to mark_oom_victim() won't help.
>
> Why would it be too late? Or in other words why would this be harmful?
>

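Setting TIF_MEMDIE after exit_mm() has completed is too late, because the
thread has already passed the test_thread_flag(TIF_MEMDIE) check at the end
of exit_mm() and will not call exit_oom_victim() from there.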

static void exit_mm(void)
{
	(...snipped...)
	task_lock(current);
	current->mm = NULL;
	up_read(&mm->mmap_sem);
	enter_lazy_tlb(mm, current);
	task_unlock(current);
	mm_update_next_owner(mm);
	mmput(mm);
	if (test_thread_flag(TIF_MEMDIE))
		exit_oom_victim();
}