Re: fs/proc: Crash observed in next_tgid (fs/proc/base.c)

From: Kees Cook
Date: Tue Apr 16 2019 - 23:38:31 EST


On Mon, Apr 15, 2019 at 7:58 AM Jitendra Sharma <shajit@xxxxxxxxxxxxxx> wrote:
>
> Hi Kees Cook/Luis,
>
> We are observing a kernel crash in the next_tgid() function via the
> getdents64 path. The call stack is shown below:
>
> -000|has_group_leader_pid(inline)
> -000|next_tgid(
> | [X20] ns = 0xFFFFFF87CABB1AC0,
> | [locdesc] iter = (
> | [locdesc] tgid = 424,
> | [locdesc] task = ?))
> | [X21] p = 0xFFFFFFD0FFFFF948
> | [X21] task = 0xFFFFFFD0FFFFF948
> -001|proc_pid_readdir(
> | [X20] file = 0xFFFFFFD1AC60FC40,
> | [X19] ctx = 0xFFFFFF8027363E40)
> | [X21] ns = 0xFFFFFF87CABB1AC0
> -002|proc_root_readdir(
> | [X20] file = 0xFFFFFFD1AC60FC40,
> | [X19] ctx = 0xFFFFFF8027363E40)
> -003|iterate_dir(
> | [X19] file = 0xFFFFFFD1AC60FC40,
> | [X22] ctx = 0xFFFFFF8027363E40)
> | [X23] inode = 0xFFFFFFD1F20246D0
> -004|SYSC_getdents64(inline)
> -004|sys_getdents64(
> | ?,
> | ?,
> | [X19] count = 4200)
> | [X19] count = 4200
> | [X20] f = ([X20] file = 0xAC60FC43AC60FC40, [X20] flags = 1207898624)
> | [X0] error = -1720
> -005|el0_svc_naked(asm)
> -->|exception
> -006|NUX:0x78C5AD7D38(asm)
> ---|end of frame
>
>
> From this call stack, task 0xFFFFFFD0FFFFF948 appears to be invalid:
> in the ramdumps it does not have any valid fields. The abort happens
> while has_group_leader_pid() is accessing the fields of this task
> struct.
>
> From the dumps, it's not clear why the task struct is invalid
> (possibly the task has already exited). This issue is observed
> during normal monkey testing over long hours.
>
> Could you please provide some pointers that could help in debugging
> this issue further?

Do you have any hints on how to reproduce this? I assume something is
missing proper locking or RCU handling, but I don't see anything
obvious in the surrounding code yet...
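
In case it helps narrow things down, here is a rough userspace sketch of
a stress test for this path: one process repeatedly calls getdents64()
on /proc (with the same 4200-byte buffer size seen in the trace) while
another forks and reaps short-lived children, so that tasks are
appearing and exiting during the directory walk. The function names
(scan_proc_once, stress_proc_readdir) are made up for illustration, and
this is only a guess at the kind of workload involved, not a known
reproducer.

```c
/* Hypothetical stress sketch: walk /proc via getdents64() while tasks
 * are being created and destroyed concurrently. Not a known reproducer. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Walk /proc once with raw getdents64() calls.
 * Returns the total number of bytes of directory entries read,
 * or -1 if open() or getdents64() fails. */
long scan_proc_once(void)
{
	char buf[4200];	/* same count as in the reported call stack */
	long total = 0;
	long n;
	int fd = open("/proc", O_RDONLY);

	if (fd < 0)
		return -1;
	while ((n = syscall(SYS_getdents64, fd, buf, sizeof(buf))) > 0)
		total += n;
	close(fd);
	return n < 0 ? -1 : total;
}

/* Scan /proc 'iterations' times while a helper process keeps forking
 * and reaping children that exit immediately.
 * Returns the number of scans that completed without error, or -1 if
 * the helper could not be started. */
long stress_proc_readdir(int iterations)
{
	long scans = 0;
	pid_t churner = fork();

	if (churner < 0)
		return -1;
	if (churner == 0) {
		/* Helper: create and destroy tasks until killed. */
		for (;;) {
			pid_t p = fork();

			if (p == 0)
				_exit(0);
			if (p > 0)
				waitpid(p, NULL, 0);
		}
	}
	for (int i = 0; i < iterations; i++) {
		if (scan_proc_once() >= 0)
			scans++;
	}
	kill(churner, SIGKILL);
	waitpid(churner, NULL, 0);
	return scans;
}
```

Calling stress_proc_readdir() from main() with a large iteration count,
ideally with several instances running in parallel, would be the
intended use. Whether it triggers anything on the affected kernel is an
open question.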

--
Kees Cook