Re: [PATCH v8] riscv: mm: Add support for Svinval extension

From: Palmer Dabbelt
Date: Wed Jul 24 2024 - 10:53:14 EST


On Tue, 02 Jul 2024 03:26:37 PDT (-0700), mchitale@xxxxxxxxxxxxxxxx wrote:
The Svinval extension splits SFENCE.VMA instruction into finer-grained
invalidation and ordering operations and is mandatory for RVA23S64 profile.
When Svinval is enabled the local_flush_tlb_range_threshold_asid function
should use the following sequence to optimize the tlb flushes instead of

Do you have any performance numbers for the optimization? As per here <https://lore.kernel.org/all/mhng-f799bd2b-7f22-4c03-bdb2-903fa3b5d508@palmer-ri-x1c9a/>.

a simple sfence.vma:

sfence.w.inval
svinval.vma
.
.
svinval.vma
sfence.inval.ir

The maximum number of consecutive svinval.vma instructions that
can be executed in local_flush_tlb_range_threshold_asid function
is limited to 64. This is required to avoid soft lockups and the
approach is similar to that used in arm64.

Signed-off-by: Mayuresh Chitale <mchitale@xxxxxxxxxxxxxxxx>
Reviewed-by: Andrew Jones <ajones@xxxxxxxxxxxxxxxx>
---
Changes in v8:
- Fix line wrap
- Add RB tag

Changes in v7:
- Use existing svinval macros in the insn-def.h
- Rename local_sinval_vma_asid to local_sinval_vma

Changes in v6:
- Rebase on latest torvalds/master

Changes in v5:
- Reduce tlb flush threshold to 64
- Improve implementation of local_flush_tlb* functions

Changes in v4:
- Rebase and refactor as per latest changes on torvalds/master
- Drop patch 1 in the series

Changes in v3:
- Fix incorrect vma used for sinval instructions
- Use unified static key mechanism for svinval
- Rebased on torvalds/master

Changes in v2:
- Rebased on 5.18-rc3
- update riscv_fill_hwcap to probe Svinval extension

arch/riscv/mm/tlbflush.c | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)

diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index 9b6e86ce3867..782147a63f3b 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -6,6 +6,27 @@
#include <linux/hugetlb.h>
#include <asm/sbi.h>
#include <asm/mmu_context.h>
+#include <asm/cpufeature.h>
+
+#define has_svinval() riscv_has_extension_unlikely(RISCV_ISA_EXT_SVINVAL)
+
+static inline void local_sfence_inval_ir(void)
+{
+ asm volatile(SFENCE_INVAL_IR() ::: "memory");
+}
+
+static inline void local_sfence_w_inval(void)
+{
+ asm volatile(SFENCE_W_INVAL() ::: "memory");
+}
+
+static inline void local_sinval_vma(unsigned long vma, unsigned long asid)
+{
+ if (asid != FLUSH_TLB_NO_ASID)
+ asm volatile(SINVAL_VMA(%0, %1) : : "r" (vma), "r" (asid) : "memory");
+ else
+ asm volatile(SINVAL_VMA(%0, zero) : : "r" (vma) : "memory");
+}

/*
* Flush entire TLB if number of entries to be flushed is greater
@@ -26,6 +47,16 @@ static void local_flush_tlb_range_threshold_asid(unsigned long start,
return;
}

+ if (has_svinval()) {
+ local_sfence_w_inval();
+ for (i = 0; i < nr_ptes_in_range; ++i) {
+ local_sinval_vma(start, asid);
+ start += stride;
+ }
+ local_sfence_inval_ir();
+ return;
+ }
+
for (i = 0; i < nr_ptes_in_range; ++i) {
local_flush_tlb_page_asid(start, asid);
start += stride;