Re: [PATCH v2] riscv: Add tracepoints for SBI calls and returns

From: Alexandre Ghiti
Date: Fri Aug 16 2024 - 13:30:15 EST


On 13/08/2024 22:26, Conor Dooley wrote:
On Thu, Mar 21, 2024 at 04:01:25PM -0700, Samuel Holland wrote:
These are useful for measuring the latency of SBI calls. The SBI HSM
extension is excluded because those functions are called from contexts
such as cpuidle where instrumentation is not allowed.

Reviewed-by: Andrew Jones <ajones@xxxxxxxxxxxxxxxx>
Signed-off-by: Samuel Holland <samuel.holland@xxxxxxxxxx>
I seem to have bisected a boot failure on v6.11-rc# to this commit,
with a crash before the kernel ever gets to output anything - even on
the bootconsole. For one of these crashes I got the below (which is
actually U-Boots exception handler), but most of the time - nothing at
all. I'll try to decodecode that tomorrow and fish out a config, because
this is either config or toolchain dependant (I saw it in work, with a
slightly different config etc to what I have here).


Starting kernel ...
Unhandled exception: Load access fault
EPC: 000000008020e09c RA: 000000008020e510 TVAL: ffffffff81cbf1f0
EPC: 00000000408a009c RA: 00000000408a0510 reloc adjusted
Code: 3597 01a4 b583 1f45 5613 0035 9a61 95b2 (618c)
resetting ...
reset not supported yet
### ERROR ### Please RESET the board ###


Cheers,
Conor.


So I have just hit a similar issue while debugging the syzkaller failure. The problem is visible with KASAN, TRACEPOINTS and RISCV_ALTERNATIVE_EARLY.

The latter is important since this is where the very early sbi call is made in riscv_fill_cpu_mfr_info() (https://elixir.bootlin.com/linux/v6.11-rc3/source/arch/riscv/kernel/alternative.c#L32).

And since kernel/sbi.c is compiled with KASAN instrumentation, it fails to access the KASAN region as it is not yet set up (the MMU is not up yet). So the following fixes the issue:

diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 06d407f1b30b..3b926c9444d0 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -31,6 +31,7 @@ endif
 ifdef CONFIG_KASAN
 KASAN_SANITIZE_alternative.o := n
 KASAN_SANITIZE_cpufeature.o := n
+KASAN_SANITIZE_sbi.o := n
 endif
 endif

But I don't like this solution as it removes instrumentation everywhere in this file. I'd rather move the __sbi_ecall() into its own file and remove the instrumentation here.

@Conor Can you give the diff above a try and see if it fixes your issue?

Thanks,

Alex



_______________________________________________
linux-riscv mailing list
linux-riscv@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/linux-riscv