Re: Bug: fio traps into kernel without exiting because futex has adeadloop
From: Peter Zijlstra
Date: Thu Jun 11 2009 - 01:56:17 EST
On Thu, 2009-06-11 at 11:08 +0800, Zhang, Yanmin wrote:
> I investigate a fio hang issue. When I run fio multi-process
> testing on many disks, fio traps into kernel and doesn't exit
> (mostly hit once after runing sub test cases for ïhundreds of times).
>
> Oprofile data shows kernel consumes time with some futex functions.
> Command kill couldn't kill the process and machine reboot also hangs.
>
> Eventually, I locate the root cause as a bug of futex. Kernel enters
> a deadloop between 'retry' and 'goto retry' in function futex_wake_op.
> By unknown reason (might be an issue of fio or glibc), parameter uaddr2
> points to an area which is READONLY. So futex_atomic_op_inuser returns
> -EFAULT when trying to changing the data at uaddr2, but later get_user
> still succeeds becasue the area is READONLY. Then go back to retry.
>
> I create a simple test case to trigger it, which just shmat an READONLY
> area for address uaddr2.
>
> It could be used as a DOS attack.
commit 2070887fdeacd9c13f3e805e3f0086c9f22a4d93
Author: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Date: Tue May 19 23:04:59 2009 +0200
futex: fix restart in wait_requeue_pi
If the waiter has been requeued to the outer PI futex and is
interrupted by a signal and the thread handles the signal then
ERESTART_RESTARTBLOCK is changed to EINTR and the restart block is
discarded. That way we return an unexcpected EINTR to user space
instead of ending up in futex_lock_pi_restart.
But we do not need to restart the syscall because we know that the
condition has changed since we have been requeued. If we would simply
restart the syscall then we would drop out via the comparison of the
user space value with EWOULDBLOCK.
The user space side needs to handle EWOULDBLOCK anyway as the
enqueueing on the inner futex can race with a requeue/wake. So we can
simply return EWOULDBLOCK to user space which also signals that we did
not take the outer futex and let user space handle it in the same way
it has to handle the requeue/wake race.
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/