Re: [PATCH] futex: Check for uaddr alignment as early as possible

From: Darren Hart
Date: Tue Dec 12 2017 - 10:57:31 EST


On Tue, Dec 12, 2017 at 11:31:02AM +0100, Ingo Molnar wrote:
>
> * Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> > On Fri, 8 Dec 2017, Darren Hart wrote:
> >
> > > From: "Darren Hart (VMware)" <dvhart@xxxxxxxxxxxxx>
> > >
> > > uaddr alignment is currently tested by get_futex_key(). We can catch
> > > misalignment earlier in sys_futex and return -EINVAL sooner. This
> > > simplifies get_futex_key() a little, but more importantly exits the
> > > kernel as soon as an invalid parameter is detected.
> > >
> > > Passes all selftests/futex testcases on a dual socket Xeon E5-2670, 16
> > > physical cores total, 32 threads total.
> > >
> > > Signed-off-by: Darren Hart (VMware) <dvhart@xxxxxxxxxxxxx>
> > > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> > > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > > Cc: Darren Hart <dvhart@xxxxxxxxxxxxx>
> > > ---
> > > kernel/futex.c | 7 +++++--
> > > 1 file changed, 5 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/kernel/futex.c b/kernel/futex.c
> > > index 76ed592..c3ee6c4 100644
> > > --- a/kernel/futex.c
> > > +++ b/kernel/futex.c
> > > @@ -509,8 +509,6 @@ get_futex_key(u32 __user *uaddr, int fshared, union futex_key *key, int rw)
> > > * The futex address must be "naturally" aligned.
> > > */
> > > key->both.offset = address % PAGE_SIZE;
> > > - if (unlikely((address % sizeof(u32)) != 0))
> > > - return -EINVAL;
> > > address -= key->both.offset;
> > >
> > > if (unlikely(!access_ok(rw, uaddr, sizeof(u32))))
> > > @@ -3525,6 +3523,11 @@ SYSCALL_DEFINE6(futex, u32 __user *, uaddr, int, op, u32, val,
> > > u32 val2 = 0;
> > > int cmd = op & FUTEX_CMD_MASK;
> > >
> > > + /* Only allow for aligned uaddr variables */
> > > + if (unlikely((unsigned long)uaddr % sizeof(u32) != 0 ||
> > > + (unsigned long)uaddr2 % sizeof(u32) != 0))
> >
> > Errm. How is that supposed to work? uaddr2 is not used by all opcodes.....
>
> So to explain the curious timing of the mails from Thomas and me: I told Thomas
> about the breakage over IRC and he found the likely bug! ;-)

My thinking had been that while uaddr2 is not used by all the op-codes,
it is never used as anything but a userspace address, and when it
isn't used, it shouldn't be getting garbage passed in.

So if it's failing due to a uaddr2 not being aligned for an op-code that
doesn't use it... that would have to be on FUTEX_WAIT*, FUTEX_WAKE,
FUTEX_WAKE_BITSET, or FUTEX_*_PI.

If it's failing on distro boot, the PI ops are unlikely candidates.

I'm curious what a valid use of this would be, and why my own tests
didn't catch it.

I can move the uaddr2 test under a conditional on the cmd for now. Would
that be acceptable?
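
A rough userspace sketch of what such a conditional could look like (not
the actual patch): uaddr is always checked, uaddr2 only for commands
that consume it. The sk_ names and the local enum are hypothetical; the
numeric values are meant to mirror the uapi futex header, and the set of
commands treated as using uaddr2 is an assumption based on my reading of
futex(2). cmd is expected to be already masked with FUTEX_CMD_MASK.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Sketch-local copies of the futex command numbers (assumed to match
 * the kernel's uapi header); the SK_ prefix marks them as illustrative. */
enum {
	SK_FUTEX_WAIT            = 0,
	SK_FUTEX_WAKE            = 1,
	SK_FUTEX_REQUEUE         = 3,
	SK_FUTEX_CMP_REQUEUE     = 4,
	SK_FUTEX_WAKE_OP         = 5,
	SK_FUTEX_LOCK_PI         = 6,
	SK_FUTEX_UNLOCK_PI       = 7,
	SK_FUTEX_TRYLOCK_PI      = 8,
	SK_FUTEX_WAIT_BITSET     = 9,
	SK_FUTEX_WAKE_BITSET     = 10,
	SK_FUTEX_WAIT_REQUEUE_PI = 11,
	SK_FUTEX_CMP_REQUEUE_PI  = 12,
};

/* Assumption: these are the commands that take a second futex address. */
bool sk_cmd_uses_uaddr2(int cmd)
{
	switch (cmd) {
	case SK_FUTEX_REQUEUE:
	case SK_FUTEX_CMP_REQUEUE:
	case SK_FUTEX_WAKE_OP:
	case SK_FUTEX_WAIT_REQUEUE_PI:
	case SK_FUTEX_CMP_REQUEUE_PI:
		return true;
	default:
		return false;
	}
}

/* A futex word is a u32, so "naturally aligned" means a multiple of 4. */
bool sk_is_u32_aligned(uintptr_t addr)
{
	return addr % sizeof(uint32_t) == 0;
}

/* Returns 0 if the addresses are acceptable for cmd, -EINVAL otherwise.
 * uaddr2 is ignored entirely for commands that do not use it. */
int sk_check_futex_alignment(uintptr_t uaddr, uintptr_t uaddr2, int cmd)
{
	if (!sk_is_u32_aligned(uaddr))
		return -EINVAL;
	if (sk_cmd_uses_uaddr2(cmd) && !sk_is_u32_aligned(uaddr2))
		return -EINVAL;
	return 0;
}
```

With this shape, a FUTEX_WAKE call that happens to carry a stray value in
the uaddr2 slot is accepted, while a FUTEX_CMP_REQUEUE with a misaligned
uaddr2 is still rejected early.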

Then I can add a WARN_ON for the other ops for my own testing, but not
for upstream.

--
Darren Hart
VMware Open Source Technology Center