Re: [patch 0/5] lightweight robust futexes: -V1

From: Andrew Morton
Date: Wed Feb 15 2006 - 16:45:19 EST


Ingo Molnar <mingo@xxxxxxx> wrote:
>
> ...
>
> E.g. in David Singleton's robust-futex-6.patch, there are 3 new syscall
> variants to sys_futex(): FUTEX_REGISTER, FUTEX_DEREGISTER and
> FUTEX_RECOVER. The kernel attaches such robust futexes to vmas (via
> vma->vm_file->f_mapping->robust_head), and at do_exit() time, all vmas
> are searched to see whether they have a robust_head set.

Hm. What happens if the futex is in anonymous memory (vm_file == NULL)?

> New approach to robust futexes
> ------------------------------
>
> At the heart of this new approach there is a per-thread private list of
> robust locks that userspace is holding (maintained by glibc) - which
> userspace list is registered with the kernel via a new syscall [this
> registration happens at most once per thread lifetime]. At do_exit()
> time, the kernel checks this user-space list: are there any robust futex
> locks to be cleaned up?

Neat.
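
To make the scheme concrete, here is a minimal userspace sketch of such a per-thread list and of the walk the kernel would do at exit time. It only models the idea and is not the code from the patch: the type and function names (robust_head, robust_lock, exit_walk), the OWNER_DIED and TID_MASK values, and the WALK_LIMIT bound are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only; the patch may define these differently. */
#define OWNER_DIED 0x40000000u  /* flag set on a futex whose owner died */
#define TID_MASK   0x3fffffffu  /* low bits hold the owner's TID */
#define WALK_LIMIT 1024         /* hypothetical cap on entries visited */

struct robust_node {
    struct robust_node *next;
};

struct robust_lock {
    struct robust_node node;    /* first member, so node* casts to lock* */
    uint32_t futex;             /* owner TID while held, 0 when free */
};

struct robust_head {
    struct robust_node list;    /* circular: an empty list points at itself */
};

static void head_init(struct robust_head *h)
{
    h->list.next = &h->list;
}

/* Userspace side: link a lock at the front of the thread's private list. */
static void robust_add(struct robust_head *h, struct robust_lock *l)
{
    assert(l != NULL);
    l->node.next = h->list.next;
    h->list.next = &l->node;
}

/*
 * Model of the exit-time walk: flag every listed futex still owned by
 * the dying thread. Returns the number of futexes flagged.
 */
static int exit_walk(struct robust_head *h, uint32_t tid)
{
    int flagged = 0;
    int steps = 0;
    struct robust_node *n;

    for (n = h->list.next; n != &h->list && steps < WALK_LIMIT;
         n = n->next, steps++) {
        struct robust_lock *l = (struct robust_lock *)n;

        if ((l->futex & TID_MASK) == tid) {
            l->futex |= OWNER_DIED;
            flagged++;
        }
    }
    return flagged;
}
```

The step cap is there because the list lives in user memory, so the walker cannot trust it to be well-formed; it bounds the walk even if the list has been turned into a cycle that never returns to the head.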

>
> ...
> The list is guaranteed to be private and per-thread, so it's lockless.
>

Why is that guaranteed? Couldn't another thread be scribbling on it while
the kernel is walking it?

Why use a list rather than a sparse array? (realloc() would work.)

>
> There is one race possible though: since adding to and removing from the
> list is done after the futex is acquired by glibc, there is a few
> instructions window for the thread (or process) to die there, leaving
> the futex hung. To protect against this possibility, userspace (glibc)
> also maintains a simple per-thread 'list_op_pending' field, to allow the
> kernel to clean up if the thread dies after acquiring the lock, but just
> before it could have added itself to the list. Glibc sets this
> list_op_pending field before it tries to acquire the futex, and clears
> it after the list-add (or list-remove) has finished.

Oh. I'm surprised that glibc cannot just add the futex to the list prior
to acquiring it; then the exit-time code can work out whether the futex was
really taken-and-contended. Even if the kernel makes a mistake, it either
won't find a futex there or it won't wake anyone up.
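
A rough sketch of that add-before-acquire ordering, assuming the futex word holds the owner's TID while locked and 0 while free. The names (lock_entry, publish, try_acquire, cleanup) and the OWNER_DIED and TID_MASK values are hypothetical, and only the acquire side is modelled; unlock and list removal would need the same care.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only. */
#define OWNER_DIED 0x40000000u
#define TID_MASK   0x3fffffffu

struct lock_entry {
    struct lock_entry *next;
    _Atomic uint32_t futex;     /* owner TID while held, 0 when free */
};

/* Step 1: put the entry on the per-thread list BEFORE touching the futex. */
static void publish(struct lock_entry **head, struct lock_entry *e)
{
    e->next = *head;
    *head = e;
}

/* Step 2: try to take the lock by swinging the futex word from 0 to tid. */
static bool try_acquire(struct lock_entry *e, uint32_t tid)
{
    uint32_t expected = 0;

    assert(tid != 0);           /* 0 means "unlocked", so it is not a valid TID */
    return atomic_compare_exchange_strong(&e->futex, &expected, tid);
}

/*
 * Exit-time cleanup: an entry that was published but never acquired holds
 * either 0 or some other thread's TID, so it is skipped. Returns the
 * number of futexes flagged as owner-died.
 */
static int cleanup(struct lock_entry *head, uint32_t tid)
{
    int flagged = 0;
    struct lock_entry *e;

    assert(tid != 0);
    for (e = head; e != NULL; e = e->next) {
        uint32_t v = atomic_load(&e->futex);

        if ((v & TID_MASK) != tid)
            continue;           /* never taken by us: nothing to clean up */
        atomic_fetch_or(&e->futex, OWNER_DIED);
        flagged++;
    }
    return flagged;
}
```

With this ordering, a thread that dies between publish() and try_acquire() leaves a stale list entry but no hung futex, which is the case the list_op_pending field is meant to cover.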


I think the patch breaks the build if CONFIG_FUTEX=n?

The patches are misordered - with only the first patch applied, the kernel
won't build. That's a nasty little landmine for git-bisect users.

Why do we need sys_get_robust_list(other task)?