Re: [PATCH] riscv: Define TASK_SIZE_MAX for __access_ok()

From: Alexandre Ghiti
Date: Tue Mar 19 2024 - 12:52:18 EST


Hi  Samuel,

On 18/03/2024 22:29, Samuel Holland wrote:
Hi Alex,

On 2024-03-18 3:50 PM, Alexandre Ghiti wrote:
On Wed, Mar 13, 2024 at 7:00 PM Samuel Holland
<samuel.holland@xxxxxxxxxx> wrote:
TASK_SIZE_MAX should be set to the largest userspace address under any
runtime configuration. This optimizes the check in __access_ok(), which
no longer needs to compute the current value of TASK_SIZE. It is still
safe because addresses between TASK_SIZE and TASK_SIZE_MAX are invalid
at the hardware level.

This removes about half of the references to pgtable_l[45]_enabled.

Signed-off-by: Samuel Holland <samuel.holland@xxxxxxxxxx>
---

arch/riscv/include/asm/pgtable-64.h | 1 +
arch/riscv/include/asm/pgtable.h | 1 +
2 files changed, 2 insertions(+)

diff --git a/arch/riscv/include/asm/pgtable-64.h b/arch/riscv/include/asm/pgtable-64.h
index b99bd66107a6..a677ef3c0fe2 100644
--- a/arch/riscv/include/asm/pgtable-64.h
+++ b/arch/riscv/include/asm/pgtable-64.h
@@ -17,6 +17,7 @@ extern bool pgtable_l5_enabled;
#define PGDIR_SHIFT_L4 39
#define PGDIR_SHIFT_L5 48
#define PGDIR_SIZE_L3 (_AC(1, UL) << PGDIR_SHIFT_L3)
+#define PGDIR_SIZE_L5 (_AC(1, UL) << PGDIR_SHIFT_L5)

#define PGDIR_SHIFT (pgtable_l5_enabled ? PGDIR_SHIFT_L5 : \
(pgtable_l4_enabled ? PGDIR_SHIFT_L4 : PGDIR_SHIFT_L3))
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 6066822e7396..2032f8ac5fc5 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -867,6 +867,7 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte)
#ifdef CONFIG_64BIT
#define TASK_SIZE_64 (PGDIR_SIZE * PTRS_PER_PGD / 2)
#define TASK_SIZE_MIN (PGDIR_SIZE_L3 * PTRS_PER_PGD / 2)
+#define TASK_SIZE_MAX (PGDIR_SIZE_L5 * PTRS_PER_PGD / 2)

#ifdef CONFIG_COMPAT
#define TASK_SIZE_32 (_AC(0x80000000, UL))
--
2.43.1

I think you also need to change the check in handle_page_fault() by
using TASK_SIZE_MAX instead of TASK_SIZE, otherwise the fixup can't
happen (https://elixir.bootlin.com/linux/latest/source/arch/riscv/mm/fault.c#L273).
It is not necessary to change that check in fault.c unless we expect to handle
exceptions (outside of userspace access routines) for addresses between
TASK_SIZE and TASK_SIZE_MAX.


Which I think could be the case if some code is only using the "new" access_ok() without the uaccess routines (which is wrong yes, but such code is at the origin of this check).


It looks like the call to fixup_exception() [added
in 416721ff05fd ("riscv, mm: Perform BPF exhandler fixup on page fault")] is
only intended to catch null pointer dereferences. So making the change wouldn't
have any functional impact, but it would still be a valid optimization.

Or I was wondering if it would not be better to do like x86 and use an
alternative, it would be more correct (even though I believe your
solution works)
https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/page_64.h#L82.
What would be the benefit of using an alternative? Any access to an address
between TASK_SIZE and TASK_SIZE_MAX is guaranteed to generate a page fault, so
the only benefit I see is returning -EFAULT slightly faster at the cost of
applying a few hundred alternatives at boot. But it's possible I'm missing
something.


The use of alternatives allows to return right away if the buffer is beyond the usable user address space, and it's not just "slightly faster" for some cases (a very large buffer with only a few bytes being beyond the limit or someone could fault-in all the user pages and fail very late...etc). access_ok() is here to guarantee that such situations don't happen, so actually it makes more sense to use an alternative to avoid that.


Alex


Regards,
Samuel


_______________________________________________
linux-riscv mailing list
linux-riscv@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/linux-riscv