Re: commit cfafcd117 "futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()" causes glibc nptl/tst-robustpi8 failure
From: Peter Zijlstra
Date: Thu May 18 2017 - 07:43:52 EST
On Thu, May 18, 2017 at 10:34:34AM +0200, Florian Weimer wrote:
> On 05/18/2017 10:31 AM, Peter Zijlstra wrote:
> > But it does that after building the tst-robustpi8 thing, so I seem to
> > have all I need here.
>
> Great, have fun figuring out what's going on. :-/
ld-linux-x86-64-2165 [018] .... 290.235869: sched_process_fork: comm=ld-linux-x86-64 pid=2165 child_comm=ld-linux-x86-64 child_pid=2166
ld-linux-x86-64-2166 [019] .... 290.436398: handle_futex_death: 00007f066634e870: 876 -> 40000000
ld-linux-x86-64-2166 [019] .... 290.436399: handle_futex_death: 00007f066634e0c8: 876 -> 40000000
ld-linux-x86-64-2166 [019] .... 290.436400: handle_futex_death: 00007f066634ee38: 80000876 -> c0000000
ld-linux-x86-64-2166 [019] .... 290.436401: sched_process_exit: comm=ld-linux-x86-64 pid=2166 prio=120
ld-linux-x86-64-2164 [019] ...1 290.436546: attach_to_pi_owner: 2: 00007f066634e078 = 80000876
ld-linux-x86-64-2183 [026] .... 827.987914: sched_process_fork: comm=ld-linux-x86-64 pid=2183 child_comm=ld-linux-x86-64 child_pid=2187
ld-linux-x86-64-2187 [029] .... 828.188218: handle_futex_death: 00007f76dd361690: 88b -> 40000000
ld-linux-x86-64-2187 [029] .... 828.188219: handle_futex_death: 00007f76dd361898: 8000088b -> c0000000
ld-linux-x86-64-2187 [029] .... 828.188220: handle_futex_death: 00007f76dd3615c8: 8000088b -> c0000000
ld-linux-x86-64-2187 [029] .... 828.188220: handle_futex_death: 00007f76dd3612d0: 8000088b -> c0000000
ld-linux-x86-64-2187 [029] .... 828.188220: handle_futex_death: 00007f76dd361af0: 8000088b -> c0000000
ld-linux-x86-64-2187 [029] .... 828.188221: handle_futex_death: 00007f76dd361168: 8000088b -> c0000000
ld-linux-x86-64-2187 [029] .... 828.188222: sched_process_exit: comm=ld-linux-x86-64 pid=2187 prio=120
ld-linux-x86-64-2182 [019] ...1 828.188373: attach_to_pi_owner: 2: 00007f76dd361000 = 8000088b
In both cases we fail in FUTEX_LOCK_PI trying to acquire a futex owned
by a dead task, resulting in the -ESRCH.
Now, pthread_mutex_lock() isn't expecting -ESRCH for robust futexes,
because for robust we'd expect handle_futex_death() to clear out the
futex value and set OWNER_DIED, as can be seen above.
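To make the trace values easier to read, here is a minimal sketch of how the futex word decodes. The three masks mirror FUTEX_WAITERS, FUTEX_OWNER_DIED and FUTEX_TID_MASK from <linux/futex.h>. The helper names are hypothetical, and model_owner_death() only models the value transition visible in the trace; it leaves out the owner-TID check and the atomic update the kernel has to perform.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirror FUTEX_WAITERS, FUTEX_OWNER_DIED and FUTEX_TID_MASK
 * from <linux/futex.h>; redefined here so the sketch stands alone. */
#define FUTEX_WAITERS    0x80000000u
#define FUTEX_OWNER_DIED 0x40000000u
#define FUTEX_TID_MASK   0x3fffffffu

/* Hypothetical helper: extract the owner TID from a futex word. */
static uint32_t futex_owner_tid(uint32_t uval)
{
	return uval & FUTEX_TID_MASK;
}

/* Hypothetical model of the exit-time fixup seen in the trace:
 * drop the TID, keep the WAITERS bit, set OWNER_DIED. */
static uint32_t model_owner_death(uint32_t uval)
{
	return (uval & FUTEX_WAITERS) | FUTEX_OWNER_DIED;
}

/* Self-check against values copied from the trace above. */
static void check_against_trace(void)
{
	/* 0x876 is pid 2166, 0x88b is pid 2187. */
	assert(futex_owner_tid(0x876u) == 2166);
	assert(futex_owner_tid(0x8000088bu) == 2187);

	/* Fixed-up futexes: "876 -> 40000000", "80000876 -> c0000000". */
	assert(model_owner_death(0x876u) == 0x40000000u);
	assert(model_owner_death(0x80000876u) == 0xc0000000u);

	/* The futex we fail on still names the dead task as owner and
	 * never got OWNER_DIED set. */
	assert(futex_owner_tid(0x80000876u) == 2166);
	assert((0x80000876u & FUTEX_OWNER_DIED) == 0);
}
```

Calling check_against_trace() from a main() confirms that the value seen by attach_to_pi_owner (80000876) still carries TID 2166 without OWNER_DIED, which is the state that leads to -ESRCH.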
The problem, however, is that the futex address we fail on doesn't appear
to have been fixed up, so it's either not on the robust list, or the
robust list got broken.
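As a rough userspace illustration of the "not on the robust list" case: the sketch below walks a robust list the way the exit-time fixup is expected to find futex words (entry address plus futex_offset, then the pending entry). The struct layouts mirror struct robust_list and struct robust_list_head from <linux/futex.h>; futex_on_robust_list(), struct demo_mutex and demo_walk() are hypothetical names made up for this example, and the walk is a simplified model, not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Layouts mirror struct robust_list / struct robust_list_head in
 * <linux/futex.h>; redefined here so the sketch stands alone. */
struct robust_list {
	struct robust_list *next;
};

struct robust_list_head {
	struct robust_list list;             /* circular; empty list points at itself */
	long futex_offset;                   /* entry address + offset = futex word */
	struct robust_list *list_op_pending; /* lock being acquired or released */
};

#define ROBUST_LIST_LIMIT 2048           /* bound on the walk, as in the uapi header */

/* Hypothetical helper: would a walk of this list reach futex_addr? */
static bool futex_on_robust_list(const struct robust_list_head *head,
				 const void *futex_addr)
{
	const struct robust_list *entry = head->list.next;
	unsigned int limit = ROBUST_LIST_LIMIT;

	/* A NULL or never-terminating chain models a broken list: the walk
	 * stops early and later entries are never visited. */
	while (entry != NULL && entry != &head->list && limit-- > 0) {
		if ((const char *)entry + head->futex_offset ==
		    (const char *)futex_addr)
			return true;
		entry = entry->next;
	}

	/* The pending entry is considered as well. */
	if (head->list_op_pending != NULL &&
	    (const char *)head->list_op_pending + head->futex_offset ==
	    (const char *)futex_addr)
		return true;

	return false;
}

/* Hypothetical mutex layout used only for this demonstration. */
struct demo_mutex {
	struct robust_list node;
	uint32_t futex;
};

static void demo_walk(void)
{
	struct robust_list_head head;
	struct demo_mutex a, b, unlisted;

	head.list.next = &a.node;
	a.node.next = &b.node;
	b.node.next = &head.list;
	unlisted.node.next = NULL;
	head.futex_offset = (long)offsetof(struct demo_mutex, futex);
	head.list_op_pending = NULL;

	assert(futex_on_robust_list(&head, &a.futex));
	assert(futex_on_robust_list(&head, &b.futex));
	/* A lock that never made it onto the list is never fixed up. */
	assert(!futex_on_robust_list(&head, &unlisted.futex));

	/* Break the chain after 'a': 'b' becomes unreachable too. */
	a.node.next = NULL;
	assert(!futex_on_robust_list(&head, &b.futex));
}
```

Either failure mode in demo_walk() leaves a futex word untouched at exit, which would match a word that still reads 80000876 after its owner is gone.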
I'll see if I can narrow that down a little more.