Re: [patch 0/6] lightweight robust futexes: -V3 - Why in userspace?
From: Andrew James Wade
Date: Fri Feb 17 2006 - 18:45:04 EST
[Resending: I accidentally did reply one.]
On Thursday 16 February 2006 18:20, Esben Nielsen wrote:
> On Thu, 16 Feb 2006, Ingo Molnar wrote:
>
> >
> > * Esben Nielsen <simlo@xxxxxxxxxx> wrote:
> >
> > > As I understand the protocol the userspace task writes it's pid into
> > > the lock atomically when locking it and erases it atomically when it
> > > leaves the lock. If it is killed inbetween the pid is still there. Now
> > > if another task comes along it reads the pid, sets the wait flag and
> > > goes into the kernel. The kernel will now be able to see that the pid
> > > is no longer valid and therefore the owner must be dead.
> >
> > this is racy - we cannot know whether the PID wrapped around.
> >
> What about adding more bits to check on? The PID to lookup the task_t and
> then some extra bits to uniquely identify the actual task.
The extra identifying bits don't even have to be written at the same
time/place as the PID. They can be written after the futex is aquired,
and cleared before the futex is released. A mechanism similar to
list_op_pending can be used to fill the races: If the extra-id field is
clear, the kernel checks all the list_op_pendings registered for that PID
to see if any of the threads is in the process of aquiring/releasing that
futex. If not, FUTEX_OWNER_DIED. This last process is liable to be quite
nasty and heavy-weight, but it should also be rare if the races are small.
I believe all the races in the last process can be closed if we can count
on FUTEX_WAITERS informing us (the testing process) if the futex is
released. For each thread that might be holding the futex, if
list_op_pending doesn't equal the futex, then that thread can't be aquiring
the futex. If the extra_id is still clear, then that thread can't be holding
the futex. If list_op_pending still doesn't equal the futex, then that
thread can't be freeing the futex. Therefore that thread doesn't have the
futex. So if no threads are holding the futex, and the futex hasn't been
released during this process, then the owner must be dead.
[This assumes that the freeing thread is the same as the one that aquired
the mutex. This assumption can be relaxed to the freeing thread
being merely in the same process, but beyond that a syscall would be
needed on the freeing side to avoid races.]
I hope this is clear, I'd need to get up to speed on kernel hacking before
I could turn this into code.
Andrew Wade
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/