Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()

From: Al Viro
Date: Tue Oct 15 2019 - 14:08:54 EST

[futex folks and linux-arch Cc'd]

On Sun, Oct 13, 2019 at 08:59:49PM +0100, Al Viro wrote:

> Re plotting: how strongly would you object against passing the range to
> user_access_end()? Powerpc folks have a very close analogue of stac/clac,
> currently buried inside their __get_user()/__put_user()/etc. - the same
> places where x86 does, including futex.h and friends.
> And there it's even costlier than on x86. It would obviously be nice
> to lift it at least out of unsafe_get_user()/unsafe_put_user() and
> move into user_access_begin()/user_access_end(); unfortunately, in
> one subarchitecture they really want it the range on the user_access_end()
> side as well. That's obviously not fatal (they can bloody well save those
> into thread_info at user_access_begin()), but right now we have relatively
> few user_access_end() callers, so the interface changes are still possible.
> Other architectures with similar stuff are riscv (no arguments, same
> as for stac/clac), arm (uaccess_save_and_enable() on the way in,
> return value passed to uaccess_restore() on the way out) and s390
> (similar to arm, but there it's needed only to deal with nesting,
> and I'm not sure it actually can happen).
> It would be nice to settle the API while there are not too many users
> outside of arch/x86; changing it later will be a PITA and we definitely
> have architectures that do potentially costly things around the userland
> memory access; user_access_begin()/user_access_end() is in the right
> place to try and see if they fit there...

Another question: right now we have
if (!access_ok(uaddr, sizeof(u32)))
return -EFAULT;

ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr);
if (ret)
return ret;
in kernel/futex.c. Would there be any objections to moving access_ok()
inside the instances and moving pagefault_disable()/pagefault_enable() outside?

* on x86 that would allow folding access_ok() with STAC into
user_access_begin(). The same would be doable on other usual suspects
(arm, arm64, ppc, riscv, s390), bringing access_ok() next to their
STAC counterparts.
* pagefault_disable()/pagefault_enable() pair is universal on
all architectures, really meant to by the nature of the beast and
lifting it into kernel/futex.c would get the same situation as with
futex_atomic_cmpxchg_inatomic(). Which also does access_ok() inside
the primitive (also foldable into user_access_begin(), at that).
* access_ok() would be closer to actual memory access (and
out of the generic code).