Re: [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined

From: Russell King - ARM Linux admin
Date: Mon Sep 30 2019 - 16:53:52 EST


On Mon, Sep 30, 2019 at 02:59:25PM +0900, Masahiro Yamada wrote:
> KernelCI reports that bcm2835_defconfig is no longer booting since
> commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
> forcibly"):
>
> https://lkml.org/lkml/2019/9/26/825
>
> I also received a regression report from Nicolas Saenz Julienne:
>
> https://lkml.org/lkml/2019/9/27/263
>
> This problem has cropped up on arch/arm/configs/bcm2835_defconfig
> because it enables CONFIG_CC_OPTIMIZE_FOR_SIZE. The compiler tends
> to prefer not inlining functions with -Os. I was able to reproduce
> it with other boards and defconfig files by manually enabling
> CONFIG_CC_OPTIMIZE_FOR_SIZE.
>
> The __get_user_check() specifically uses r0, r1, r2 registers.
> So, uaccess_save_and_enable() and uaccess_restore() must be inlined
> in order to avoid those registers being overwritten in the callees.
>
> Prior to commit 9012d011660e ("compiler: allow all arches to enable
> CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
> inlining functions, except on x86.
>
> Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> So, __always_inline is now the only guaranteed way of forcible inlining.
>
> I want to keep as much compiler's freedom as possible about the inlining
> decision. So, I changed the function call order instead of adding
> __always_inline around.
>
> Call uaccess_save_and_enable() before assigning the __p ("r0"), and
> uaccess_restore() after evacuating the __e ("r0").
>
> Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
> Reported-by: "kernelci.org bot" <bot@xxxxxxxxxxxx>
> Reported-by: Nicolas Saenz Julienne <nsaenzjulienne@xxxxxxx>
> Signed-off-by: Masahiro Yamada <yamada.masahiro@xxxxxxxxxxxxx>
> ---
>
> arch/arm/include/asm/uaccess.h | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
> index 303248e5b990..559f252d7e3c 100644
> --- a/arch/arm/include/asm/uaccess.h
> +++ b/arch/arm/include/asm/uaccess.h
> @@ -191,11 +191,12 @@ extern int __get_user_64t_4(void *);
> #define __get_user_check(x, p) \
> ({ \
> unsigned long __limit = current_thread_info()->addr_limit - 1; \
> + unsigned int __ua_flags = uaccess_save_and_enable(); \

If the compiler is moving uaccess_save_and_enable(), that's something
we really don't want - the idea is to _minimise_ the number of kernel
memory accesses between enabling userspace access and performing the
actual access.

Fixing it in this way widens the window for the kernel to be doing
something it shouldn't in userspace.

So, the right solution is to ensure that the compiler always inlines
the uaccess_*() helpers - which should be nothing more than four
instructions for uaccess_save_and_enable() and two for the
restore.

I.O.W. it should look something like this:

144: ee134f10 mrc 15, 0, r4, cr3, cr0, {0}
148: e3c4200c bic r2, r4, #12
14c: e24e1001 sub r1, lr, #1
150: e3822004 orr r2, r2, #4
154: ee032f10 mcr 15, 0, r2, cr3, cr0, {0}
158: f57ff06f isb sy
15c: ebfffffe bl 0 <__get_user_4>
160: ee034f10 mcr 15, 0, r4, cr3, cr0, {0}
164: f57ff06f isb sy

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up