Re: [PATCH] riscv: Invalid instruction cache after copy the xol area
From: Palmer Dabbelt
Date: Thu Jun 16 2022 - 18:05:20 EST
On Wed, 18 May 2022 01:17:53 PDT (-0700), po-kai.chi@xxxxxxxxxx wrote:
We need to invalid the relevant instruction cache after
copying the xol area, to ensure the changes takes effect.
Signed-off-by: Po-Kai Chi <po-kai.chi@xxxxxxxxxx>
---
arch/riscv/kernel/probes/uprobes.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/riscv/kernel/probes/uprobes.c b/arch/riscv/kernel/probes/uprobes.c
index 7a057b5f0adc..9d52beeac73c 100644
--- a/arch/riscv/kernel/probes/uprobes.c
+++ b/arch/riscv/kernel/probes/uprobes.c
@@ -165,6 +165,7 @@ void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
/* Initialize the slot */
void *kaddr = kmap_atomic(page);
void *dst = kaddr + (vaddr & ~PAGE_MASK);
+ unsigned long addr = (unsigned long)dst;
memcpy(dst, src, len);
@@ -177,10 +178,9 @@ void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
kunmap_atomic(kaddr);
/*
- * We probably need flush_icache_user_page() but it needs vma.
- * This should work on most of architectures by default. If
- * architecture needs to do something different it can define
- * its own version of the function.
+ * Flush both I/D cache to ensure instruction modification
+ * takes effect.
*/
flush_dcache_page(page);
+ flush_icache_range(addr, addr + len);
}
This brings up a handful of issues:
* This always inserts a 32-bit breakpoint, but that's not quite correct.
This should really be checking the _next_ instruction as well to
insert a 16-bit breakpoint if it's a 16-bit instruction as otherwise
userspace might jump into the middle of the breakpoint.
* These instructions can be concurrently executing, which relies on some
instruction fetch ordering that's very lightly specified. We don't
rely on that elsewhere (see stop_machine() in kprobes), but we
probably should. It's probably worth adding something to probe the HW
to make sure this is supported.
* Adding the icache flush defeats a uprobes advantage in that we'll now
be triggering remote execution (to do the remote fence.i). One option
could be to defer the fence and wait on it, but not sure if that's
sane and it'd likely require a lot of
This also leaves a bit undefined WRT icache aliasing, at least in terms
of the API used. IMO it'd be
diff --git a/arch/riscv/kernel/probes/uprobes.c b/arch/riscv/kernel/probes/uprobes.c
index 9d52beeac73c..c857346864fc 100644
--- a/arch/riscv/kernel/probes/uprobes.c
+++ b/arch/riscv/kernel/probes/uprobes.c
@@ -165,7 +165,6 @@ void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
/* Initialize the slot */
void *kaddr = kmap_atomic(page);
void *dst = kaddr + (vaddr & ~PAGE_MASK);
- unsigned long addr = (unsigned long)dst;
memcpy(dst, src, len);
@@ -179,8 +178,10 @@ void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
/*
* Flush both I/D cache to ensure instruction modification
- * takes effect.
+ * takes effect. We don't need to flush the whole icache, but that's
+ * all RISC-V defines so rather than worry about aliasing this just
+ * flushes everything.
*/
flush_dcache_page(page);
- flush_icache_range(addr, addr + len);
+ flush_icache_all();
}
which will end up doing the same thing but avoids the ambiguity. I went
ahead and put this at palmer/riscv-uprobe_fencei with that and some
other minor things fixed up, LMK if that looks OK and I'll take it on
fixes.
Thanks!