Re: [PATCH] futex: Prevent pi_state from double freeing in case of error

From: Thomas Gleixner
Date: Wed Dec 23 2015 - 13:08:46 EST


Bhuvanesh,

On Wed, 23 Dec 2015, Bhuvanesh wrote:
> Apologies for not putting the backtrace earlier.

No problem. Let's look at it.

> During our regression test of the kernel version 3.14, generated a
> warning in futex code and resulted in crash with the backtrace given
> below:
>
> WARNING: CPU: 0 PID: 1468 at fs/inode.c:399 ihold+0x40/0x48()
> Backtrace:
> [<80011b88>] (dump_backtrace) from [<80011d90>] (show_stack+0x18/0x1c)
> [<80011d78>] (show_stack) from [<805036b8>] (dump_stack+0x74/0xc0)
> [<80503644>] (dump_stack) from [<8002393c>] (warn_slowpath_common+0x70/0x94)
> [<800238cc>] (warn_slowpath_common) from [<80023a04>] (warn_slowpath_null+0x24/0x2c)
> [<800239e0>] (warn_slowpath_null) from [<80115004>] (ihold+0x40/0x48)
> [<80114fc4>] (ihold) from [<8007f33c>] (get_futex_key_refs+0x58/0x64)
> [<8007f2e4>] (get_futex_key_refs) from [<8007f524>] (get_futex_key+0x1dc/0x200)
> [<8007f348>] (get_futex_key) from [<80080048>] (futex_wake+0x4c/0x144)

That's a totally different code path. It comes from get_futex_key_refs(), and
ihold() complains that the inode refcount is less than 2 after the increment,
i.e. nobody held a reference on the inode when the futex code took one. So the
futex sits in a memory mapped file.
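
For reference, the check which fires in ihold() boils down to the sketch
below. This is a userspace model and not the kernel source: struct model_inode
and model_ihold_warns() are made-up names, and only the refcount logic is
modelled.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct inode; only the refcount matters here. */
struct model_inode {
	atomic_int i_count;
};

/*
 * Model of the ihold() sanity check: take a reference and report whether
 * the warning would fire. The caller is expected to hold a reference
 * already, so the count after the increment has to be at least 2.
 */
static bool model_ihold_warns(struct model_inode *inode)
{
	/* atomic_fetch_add() returns the old value, so add 1 for the new one */
	int after = atomic_fetch_add(&inode->i_count, 1) + 1;

	return after < 2;
}
```

With i_count at 0 before the call the model reports the warning, with i_count
at 1 or more it does not.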

> [<8007fffc>] (futex_wake) from [<800819dc>] (do_futex+0xf8/0x984)
> [<800818e4>] (do_futex) from [<80082358>] (SyS_futex+0xf0/0x15c)
> [<80082268>] (SyS_futex) from [<8000e060>] (ret_fast_syscall+0x0/0x50)

> Kernel BUG at 80114f5c [verbose debug info unavailable]
> Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
> CPU: 1 PID: 826 Comm: mediaengine_out Tainted: G W 3.14.51-03408-gf4477ef #1
> task: a31c6d40 ti: 96ab2000 task.ti: 96ab2000
> PC is at clear_inode+0x5c/0x60

Here the shmem code triggers a BUG_ON() in clear_inode(), in a different task
(PID 826) on CPU 1. I can't tell which one exactly (there are a couple of them
in that function).

> LR is at preempt_count_sub+0xd8/0x104

> [<80114f00>] (clear_inode) from [<800d5e68>] (shmem_evict_inode+0x12c/0x148)
> [<800d5d3c>] (shmem_evict_inode) from [<80115154>] (evict+0x9c/0x160)
> [<801150b8>] (evict) from [<80115bc4>] (iput+0x13c/0x144)
> [<80115a88>] (iput) from [<8010bae0>] (do_unlinkat+0x108/0x1c8)
> [<8010b9d8>] (do_unlinkat) from [<8010c694>] (SyS_unlink+0x18/0x1c)
> [<8010c67c>] (SyS_unlink) from [<8000e060>] (ret_fast_syscall+0x0/0x50)

The call comes from sys_unlink. So a file is removed. I can't tell whether
this is related, but it probably is.

> We observed the above issue thrice in our testing. Unfortunately we
> don't know the usecase or steps which resulted in the above behavior,
> since the testing was random.

You might try to analyse the futex/mmap code of PID 1468. It might be
something which happens when the process shuts down and tears down the
mapping. That might give you a hint how to reproduce the issue.
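
As a starting point for such an analysis, the sketch below exercises the same
kind of setup from userspace: a futex word in a MAP_SHARED file mapping, a
FUTEX_WAKE on it without FUTEX_PRIVATE_FLAG, and an unlink of the backing file
while the mapping still exists. It is not a reproducer and I have no idea
whether your application does anything like it. The function name and the
path template are made up for illustration.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: map a futex word from a file, issue a shared
 * FUTEX_WAKE on it, then unlink the file and tear down the mapping.
 *
 * path_template is an mkstemp() template chosen by the caller. A path on a
 * tmpfs mount (e.g. /dev/shm) gives a shmem inode.
 *
 * Returns the number of woken waiters (0, as nobody waits) or -1 on error.
 */
static long shared_file_futex_wake(char *path_template)
{
	long woken = -1;
	int fd = mkstemp(path_template);

	if (fd < 0)
		return -1;

	if (ftruncate(fd, sizeof(uint32_t)) == 0) {
		uint32_t *word = mmap(NULL, sizeof(*word),
				      PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

		if (word != MAP_FAILED) {
			/* No FUTEX_PRIVATE_FLAG, so the shared key lookup is used */
			woken = syscall(SYS_futex, word, FUTEX_WAKE, 1,
					NULL, NULL, 0);
			/* Remove the file while the mapping is still alive */
			unlink(path_template);
			munmap(word, sizeof(*word));
		}
	}

	/* Fails harmlessly if the file is already gone */
	unlink(path_template);
	close(fd);
	return woken;
}
```

Calling it with a writable template like "/dev/shm/futex-sketch-XXXXXX" from
several threads or processes at once, with the teardown racing against the
wake, would be closer to what the backtraces suggest.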

Thanks,

tglx