Re: [PATCH] powerpc: Save AMR/IAMR when switching tasks
From: Christophe Leroy
Date: Mon Sep 19 2022 - 09:39:11 EST
On 19/09/2022 at 14:37, Michael Ellerman wrote:
> Christophe Leroy <christophe.leroy@xxxxxxxxxx> writes:
>> On 16/09/2022 at 07:05, Samuel Holland wrote:
>>> With CONFIG_PREEMPT=y (involuntary preemption enabled), it is possible
>>> to switch away from a task inside copy_{from,to}_user. This left the CPU
>>> with userspace access enabled until after the next IRQ or privilege
>>> level switch, when AMR/IAMR got reset to AMR_KU[AE]P_BLOCKED. Then, when
>>> switching back to the original task, the userspace access would fault:
>>
>> This is not supposed to happen. You never switch away from a task
>> magically. A task switch always happens in an interrupt, which means
>> copy_{from,to}_user() gets interrupted.
>
> Unfortunately this isn't true when CONFIG_PREEMPT=y.
Argh, yes, I wrote the above on the assumption that we properly follow
the main principle that no complex function is to be called while KUAP is
open ... which is apparently not the case here. x86 would have detected it
with objtool, but we don't have that on powerpc yet.
>
> We can switch away without an interrupt via:
>   __copy_tofrom_user()
>     -> __copy_tofrom_user_power7()
>        -> exit_vmx_usercopy()
>           -> preempt_enable()
>              -> __preempt_schedule()
>                 -> preempt_schedule()
>                    -> preempt_schedule_common()
>                       -> __schedule()
Should we use preempt_enable_no_resched() to avoid that?
>
> I do some boot tests with CONFIG_PREEMPT=y, but I realise now those are
> all on Power8, which is a bit of an oversight on my part.
>
> And clearly no one else tests it, until now :)
>
> I think the root of our problem is that our KUAP lock/unlock is at too
> high a level, ie. we do it in C around the low-level copy to/from.
>
> eg:
>
> static inline unsigned long
> raw_copy_to_user(void __user *to, const void *from, unsigned long n)
> {
> 	unsigned long ret;
>
> 	allow_write_to_user(to, n);
> 	ret = __copy_tofrom_user(to, (__force const void __user *)from, n);
> 	prevent_write_to_user(to, n);
> 	return ret;
> }
>
> There's a reason we did that, which is that we have various different
> KUAP methods on different platforms, not a simple instruction like other
> arches.
>
> But that means we have that exit_vmx_usercopy() being called deep in the
> guts of __copy_tofrom_user(), with KUAP disabled, and then we call into
> the preempt machinery and eventually schedule.
>
> I don't see an easy way to fix that "properly", it would be a big change
> to all platforms to push the KUAP save/restore down into the low level
> asm code.
>
> But I think the patch below does fix it, although it abuses things a
> little. Namely it only works because the 64s KUAP code can handle a
> double call to prevent, and doesn't need the addresses or size for the
> allow.
>
> Still I think it might be our best option for an easy fix.
Wouldn't it be even easier and less abusive to use
preempt_enable_no_resched()? Or is there definitely a good reason to
resched after a VMX copy when we don't with regular copies?
Christophe