Re: [PATCH] futex: fix shared futex operations on nommu
From: Rich Felker
Date: Tue Apr 26 2016 - 18:32:14 EST

On Tue, Apr 26, 2016 at 03:27:10PM -0700, Andrew Morton wrote:
> On Tue, 26 Apr 2016 12:27:39 -0400 Rich Felker <dalias@xxxxxxxx> wrote:
>
> > On Tue, Apr 26, 2016 at 06:11:07PM +0200, Sebastian Andrzej Siewior wrote:
> > > * Rich Felker | 2016-04-26 11:53:44 [-0400]:
> > >
> > > >The whole shared futex logic is meaningless for nommu. Perhaps I
> > > >should have written a better message, though.
> > > >
> > > >With MMU, shared futex keys need to identify the physical backing for
> > > >a memory address because it may be mapped at different addresses in
> > > >different processes (or even multiple times in the same process).
> > > >Without MMU this cannot happen. You only have physical addresses. So
> > > >the "private futex" behavior of using the virtual address as the key
> > > >is always correct (for both shared and private cases) on nommu
> > > >systems.
> > >
> > > So using a shared futex on NOMMU does work but it would be more
> > > efficient to always use a private futex instead.
> > > Is this what you are saying?
> >
> > No. What I'm saying is that the current code paths for shared futex
> > are mmu-specific. They neither work (due to different mm internals, I
> > think) nor make sense (due to lack of virtual addresses that map to
> > the same physical address) on nommu.
> >
> > The private futex code paths are correct for either private or shared
> > futexes on nommu. This is both the natural theoretical prediction, and
> > confirmed by testing the patch.
>
> It is apparent from Sebastian's questioning that a code comment will be
> needed, please.

Indeed, I agree. I'll work on a better patch. At least this sufficed
to get discussion started.

> Also, what specifically is the runtime effect of the patch? Does the
> futex code presently misbehave on NOMMU when FUTEX_PRIVATE_FLAG is
> unset?

Without this patch, all futex ops without FUTEX_PRIVATE_FLAG fail with
EFAULT. It's been a while since I tracked down where the EFAULT is
generated but it's somewhere in the shared get-key vm logic.
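
A minimal userspace probe for this behavior might look like the
following sketch. It assumes Linux with <linux/futex.h> and the raw
syscall(2) interface; the names probe_word and probe_futex_wake are
made up for illustration and are not part of the patch. It issues a
FUTEX_WAKE with no waiters, with or without FUTEX_PRIVATE_FLAG, and
returns 0 on success or the errno value on failure.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical futex word used only by this probe. */
uint32_t probe_word;

/*
 * Issue FUTEX_WAKE on probe_word. Nobody is waiting, so a working
 * kernel wakes zero tasks and the call succeeds.
 * Returns 0 on success, or the errno value if the syscall failed.
 */
int probe_futex_wake(int private_flag)
{
    int op = FUTEX_WAKE | (private_flag ? FUTEX_PRIVATE_FLAG : 0);
    long ret = syscall(SYS_futex, &probe_word, op, 1, NULL, NULL, 0);

    return ret < 0 ? errno : 0;
}
```

On a kernel with MMU, probe_futex_wake(0) and probe_futex_wake(1)
should both return 0. On nommu without the patch, the expectation from
the above is that probe_futex_wake(0) returns EFAULT while
probe_futex_wake(1) still returns 0.
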
If userspace treats this as an error, the corresponding pthread, etc.
functions fail. Otherwise, userspace just spins at 100% CPU retrying
FUTEX_WAIT, while FUTEX_WAKE "works" fine as a nop against such a wait,
etc. (In a sense an always-failing implementation of futex is a
working implementation for the basic ops, just a highly suboptimal
one.)
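
To illustrate the spinning case, here is a small simulation rather
than real kernel or libc code. It assumes a userspace wait loop that
ignores the return value of the wait call and simply rechecks the
futex word. All names (futex_wait_fn, always_efault_wait,
wait_while_equal, release_word, release_after, failed_waits) are
hypothetical. The stub stands in for a kernel whose FUTEX_WAIT always
fails with EFAULT, and it flips the word after a fixed number of calls
to model another thread eventually releasing it.

```c
#include <errno.h>
#include <stdint.h>

/* Signature of a pluggable "futex wait" used by the loop below. */
typedef int (*futex_wait_fn)(volatile uint32_t *addr, uint32_t expected);

/* Simulation state: which word to release, and after how many calls. */
unsigned long failed_waits;
volatile uint32_t *release_word;
unsigned long release_after;

/*
 * Stand-in for an always-failing FUTEX_WAIT: never sleeps, always
 * returns -1 with errno set to EFAULT. After release_after calls it
 * clears *release_word, modelling another thread making progress.
 */
int always_efault_wait(volatile uint32_t *addr, uint32_t expected)
{
    (void)addr;
    (void)expected;
    if (++failed_waits == release_after)
        *release_word = 0;
    errno = EFAULT;
    return -1;
}

/*
 * Wait until *addr no longer equals val. The result of wait() is
 * ignored, so a failing wait turns this into a busy loop.
 * Returns the number of wait calls that were made.
 */
unsigned long wait_while_equal(volatile uint32_t *addr, uint32_t val,
                               futex_wait_fn wait)
{
    unsigned long calls = 0;

    while (*addr == val) {
        wait(addr, val);
        calls++;
    }
    return calls;
}
```

The loop still terminates correctly once the word changes, but every
iteration is a failed syscall instead of a sleep, which is the "highly
suboptimal" part.
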
Rich