Re: [PATCH RFC] bpf: Add support for reading user pointers

From: Joel Fernandes
Date: Tue May 07 2019 - 22:01:24 EST


On Tue, May 07, 2019 at 10:52:42AM +0100, Will Deacon wrote:
> On Mon, May 06, 2019 at 10:57:37PM +0100, Qais Yousef wrote:
> > On 05/06/19 16:58, Joel Fernandes wrote:
> > > > If you're trying to dereference a pointer to userspace using
> > > > probe_kernel_read(), that clearly isn't going to work.
> > >
> > > Ok. Thanks for confirming as well. The existing code has this bug and these
> > > patches fix it.
> >
> > 5.1-rc7 and 4.9.173 stable both managed to read the path in do_sys_open() on my
> > Juno-r2 board using the defconfig in the tree.
>
> That's not surprising: Juno-r2 only features v8.0 CPUs, so doesn't have PAN
> or UAO capabilities. The SoC Joel is talking about is 8.2, so has both of
> those.
>
> Here's some background which might help...
>
> PAN (Privileged Access Never) prevents the kernel from inadvertently
> accessing userspace and will cause a page (permission) fault if such an
> access is made outside of the standard uaccess routines. This means that
> in those routines (e.g. get_user()) we have to toggle the PAN state in the
> same way that x86 toggles SMAP. This can be expensive and was part of the
> motivation for the adoption of things like unsafe_get_user() on x86.
>
> On arm64, we have a set of so-called "translated" memory access instructions
> which can be used to perform unprivileged accesses to userspace from within
> the kernel even when PAN is enabled. Using these special instructions (e.g.
> LDTR) in our uaccess routines can therefore remove the need to toggle PAN.
> Sounds great, right? Well, that all falls apart when the uaccess routines
> are used on kernel addresses as in probe_kernel_read() because they will
> fault when trying to dereference a kernel pointer.
>
> The answer is UAO (User Access Override). When UAO is set, the translated
> memory access instructions behave the same as non-translated accesses.
> Therefore we can toggle UAO in set_fs() so that it is set when we're using
> KERNEL_DS and cleared when we're using USER_DS.
>
> The side-effect of this is that when KERNEL_DS is set on a system that
> implements both PAN and UAO, passing a user pointer to our uaccess routines
> will return -EFAULT. In other words, set_fs() can be thought of as a
> selector between the kernel and user address spaces, which are distinct.
>
> Joel -- does disabling UAO in your .config "fix" the issue? If so, feel
> free to use some of the above text in a commit message if it helps to
> justify your change.

Disabling CONFIG_ARM64_UAO does "fix" it.

I will use the description above in the commit message as you suggested,
thanks a lot for the explantation of this!

- Joel