Re: general protection fault in oom_unkillable_task

From: Shakeel Butt
Date: Sat Jun 15 2019 - 12:16:39 EST


On Sat, Jun 15, 2019 at 6:50 AM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Fri 14-06-19 20:15:31, Shakeel Butt wrote:
> > On Fri, Jun 14, 2019 at 6:08 PM syzbot
> > <syzbot+d0fc9d3c166bc5e4a94b@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit: 3f310e51 Add linux-next specific files for 20190607
> > > git tree: linux-next
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=15ab8771a00000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=5d176e1849bbc45
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=d0fc9d3c166bc5e4a94b
> > > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > >
> > > Unfortunately, I don't have any reproducer for this crash yet.
> > >
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+d0fc9d3c166bc5e4a94b@xxxxxxxxxxxxxxxxxxxxxxxxx
> > >
> > > kasan: CONFIG_KASAN_INLINE enabled
> > > kasan: GPF could be caused by NULL-ptr deref or user memory access
> > > general protection fault: 0000 [#1] PREEMPT SMP KASAN
> > > CPU: 0 PID: 28426 Comm: syz-executor.5 Not tainted 5.2.0-rc3-next-20190607
> > > #11
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > Google 01/01/2011
> > > RIP: 0010:__read_once_size include/linux/compiler.h:194 [inline]
> > > RIP: 0010:has_intersects_mems_allowed mm/oom_kill.c:84 [inline]
> >
> > It seems like oom_unkillable_task() is broken for memcg OOMs. It
> > should not be calling has_intersects_mems_allowed() for memcg OOMs.
>
> You are right. It doesn't really make much sense to check for the NUMA
> policy/cpusets when the memcg oom is NUMA agnostic. Now that I am
> looking at the code then I am really wondering why do we even call
> oom_unkillable_task from oom_badness. proc_oom_score shouldn't care
> about NUMA either.
>
> In other words the following should fix this unless I am missing
> something (task_in_mem_cgroup seems to be a relict from before the group
> oom handling). But please note that I am still not fully operation and
> laying in the bed.
>

Yes, we need something like this but not exactly.

> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 5a58778c91d4..43eb479a5dc7 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -161,8 +161,8 @@ static bool oom_unkillable_task(struct task_struct *p,
> return true;
>
> /* When mem_cgroup_out_of_memory() and p is not member of the group */
> - if (memcg && !task_in_mem_cgroup(p, memcg))
> - return true;
> + if (memcg)
> + return false;

This will break the dump_tasks() usage of oom_unkillable_task(). We
can change dump_tasks() to traverse processes like
mem_cgroup_scan_tasks() for memcg OOMs.

>
> /* p may not have freeable memory in nodemask */
> if (!has_intersects_mems_allowed(p, nodemask))
> @@ -318,7 +318,7 @@ static int oom_evaluate_task(struct task_struct *task, void *arg)
> struct oom_control *oc = arg;
> unsigned long points;
>
> - if (oom_unkillable_task(task, NULL, oc->nodemask))
> + if (oom_unkillable_task(task, oc->memcg, oc->nodemask))
> goto next;
>
> --
> Michal Hocko
> SUSE Labs