Re: [PATCH] mm,oom_reaper: avoid run queue_oom_reaper if task is not oom

From: gaoxu
Date: Thu Nov 23 2023 - 22:16:01 EST


On Thu, 24 Nov 2023 08:51 Michal Hocko <mhocko@xxxxxxxx> wrote:
> On Wed 22-11-23 12:46:44, gaoxu wrote:
>> The function queue_oom_reaper tests and sets tsk->signal->oom_mm->flags.
>> However, it is necessary to check if 'tsk' is an OOM victim before
>> executing 'queue_oom_reaper' because the variable may be NULL.
>>
>> We encountered such an issue, and the log is as follows:
>> [3701:11_see]Out of memory: Killed process 3154 (system_server) total-vm:23662044kB, anon-rss:0kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:4056kB oom_score_adj:-900
>
>> [3701:11_see][RB/E]rb_sreason_str_set: sreason_str set null_pointer
>> [3701:11_see][RB/E]rb_sreason_str_set: sreason_str set unknown_addr
>
> What are these?
These are log messages that we added ourselves; they are not from upstream code.

>> [3701:11_see]Unable to handle kernel NULL pointer dereference at virtual address 0000000000000328
>> [3701:11_see]user pgtable: 4k pages, 39-bit VAs, pgdp=00000000821de000
>> [3701:11_see][0000000000000328] pgd=0000000000000000, p4d=0000000000000000,pud=0000000000000000
>> [3701:11_see]tracing off
>> [3701:11_see]Internal error: Oops: 96000005 [#1] PREEMPT SMP
>> [3701:11_see]Call trace:
>> [3701:11_see] queue_oom_reaper+0x30/0x170
>
> Could you resolve this offset into the code line please?
Because we added extra logging code, the line numbers may not match the upstream Linux source. The fault occurs in the test_and_set_bit() call at the top of the function:

static void queue_oom_reaper(struct task_struct *tsk)
{
	/* mm is already queued? */
	if (test_and_set_bit(MMF_OOM_REAP_QUEUED, &tsk->signal->oom_mm->flags)) /* NULL pointer dereference occurred here */
		return;
	...
}
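To illustrate why a NULL tsk->signal->oom_mm would fault at a small non-zero address rather than at 0, here is a minimal user-space sketch. It assumes that oom_mm was NULL and that the reported address is simply the offset of 'flags' within the structure. struct mm_model and flags_address() are hypothetical names, and the 0x328 padding is chosen only to mirror the address in the log; it is not taken from any real kernel layout.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct mm_struct. The padding size is an
 * assumption picked so that 'flags' lands at offset 0x328, matching the
 * faulting address in the log; the real offset depends on the kernel
 * version and configuration.
 */
struct mm_model {
	unsigned char other_fields[0x328];
	unsigned long flags;
};

/*
 * Address that &mm->flags would evaluate to for a given base address.
 * Computed arithmetically, so nothing is actually dereferenced.
 */
static uintptr_t flags_address(uintptr_t mm_base)
{
	return mm_base + offsetof(struct mm_model, flags);
}
```

With a base address of 0, flags_address() returns the member offset itself, which is the pattern the Oops address would show under these assumptions.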
>> [3701:11_see] __oom_kill_process+0x590/0x860
>> [3701:11_see] oom_kill_process+0x140/0x274
>> [3701:11_see] out_of_memory+0x2f4/0x54c
>> [3701:11_see] __alloc_pages_slowpath+0x5d8/0xaac
>> [3701:11_see] __alloc_pages+0x774/0x800
>> [3701:11_see] wp_page_copy+0xc4/0x116c
>> [3701:11_see] do_wp_page+0x4bc/0x6fc
>> [3701:11_see] handle_pte_fault+0x98/0x2a8
>> [3701:11_see] __handle_mm_fault+0x368/0x700
>> [3701:11_see] do_handle_mm_fault+0x160/0x2cc
>> [3701:11_see] do_page_fault+0x3e0/0x818
>> [3701:11_see] do_mem_abort+0x68/0x17c
>> [3701:11_see] el0_da+0x3c/0xa0
>> [3701:11_see] el0t_64_sync_handler+0xc4/0xec
>> [3701:11_see] el0t_64_sync+0x1b4/0x1b8
>> [3701:11_see]tracing off
>>
>> Signed-off-by: Gao Xu <gaoxu2@xxxxxxxxxxx>
>> ---
>> mm/oom_kill.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>> index 9e6071fde..3754ab4b6 100644
>> --- a/mm/oom_kill.c
>> +++ b/mm/oom_kill.c
>> @@ -984,7 +984,7 @@ static void __oom_kill_process(struct task_struct *victim, const char *message)
>> }
>> rcu_read_unlock();
>>
>> - if (can_oom_reap)
>> + if (can_oom_reap && tsk_is_oom_victim(victim))
>> queue_oom_reaper(victim);
>
> I do not understand. We always do send SIGKILL and call mark_oom_victim(victim); on victim task when reaching out here. How can tsk_is_oom_victim can ever be false?
This is a low-probability issue: it occurred only once, during monkey testing.
I have not been able to find the root cause yet.
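For reference, the effect of the proposed check can be modelled in user space as below. This is only a sketch under stated assumptions: all model_* names, the structs, and MODEL_REAP_QUEUED are hypothetical stand-ins, not kernel code. It relies on tsk_is_oom_victim() being a test of whether tsk->signal->oom_mm is set. It shows that the guard skips the dereference when oom_mm is NULL; it does not explain how oom_mm could be NULL after mark_oom_victim().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct mm_model { unsigned long flags; };
struct signal_model { struct mm_model *oom_mm; };
struct task_model { struct signal_model *signal; };

#define MODEL_REAP_QUEUED 0x1UL	/* hypothetical "already queued" bit */

/* Models tsk_is_oom_victim(): true only when oom_mm has been set. */
static bool model_is_oom_victim(const struct task_model *tsk)
{
	return tsk->signal->oom_mm != NULL;
}

/*
 * Models the start of queue_oom_reaper(): dereferences oom_mm without
 * checking it. Returns true if the mm was newly queued, false if it
 * had been queued before.
 */
static bool model_queue_reaper(struct task_model *tsk)
{
	struct mm_model *mm = tsk->signal->oom_mm;

	if (mm->flags & MODEL_REAP_QUEUED)
		return false;
	mm->flags |= MODEL_REAP_QUEUED;
	return true;
}

/* Models the patched tail of __oom_kill_process(). */
static bool model_kill_tail(struct task_model *victim, bool can_oom_reap)
{
	if (can_oom_reap && model_is_oom_victim(victim))
		return model_queue_reaper(victim);
	return false;
}
```

In this model, a task whose oom_mm is NULL is simply never queued for reaping, so the crash is avoided but the victim's memory is not reaped through this path.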

>>
>> mmdrop(mm);
>> --
>> 2.17.1
>>
>>
>
>--
> Michal Hocko
> SUSE Labs