Re: [PATCH 0/4] riscv: tlb flush improvements

From: Palmer Dabbelt
Date: Wed Jul 12 2023 - 13:24:01 EST


On Wed, 12 Jul 2023 10:19:47 PDT (-0700), Conor Dooley wrote:
On Wed, Jul 12, 2023 at 05:18:00PM +0200, Alexandre Ghiti wrote:
On 12/07/2023 09:08, Conor Dooley wrote:
> On Tue, Jul 11, 2023 at 09:54:30AM +0200, Alexandre Ghiti wrote:
> > This series optimizes the tlb flushes on riscv which used to simply
> > flush the whole tlb whatever the size of the range to flush or the size
> > of the stride.
> >
> > Patch 3 introduces a threshold that is microarchitecture specific and
> > will very likely be modified by vendors, not sure though which mechanism
> > we'll use to do that (dt? alternatives? vendor initialization code?).


@Conor any idea how to achieve this?

It's in my queue of things to look at, just been prioritising the
extension related stuff the last few days. Hopefully I'll have a chance
to think about this tomorrow.. Famous last words probably.

> > Next steps would be to implement:
> > - svinval extension as Mayuresh did here [1]
> > - BATCHED_UNMAP_TLB_FLUSH (I'll wait for arm64 patchset to land)
> > - MMU_GATHER_RCU_TABLE_FREE
> > - MMU_GATHER_MERGE_VMAS
> >
> > Any other idea welcome.
> >
> > [1] https://lore.kernel.org/linux-riscv/20230623123849.1425805-1-mchitale@xxxxxxxxxxxxxxxx/
> >
> > Alexandre Ghiti (4):
> > riscv: Improve flush_tlb()
> > riscv: Improve flush_tlb_range() for hugetlb pages
> > riscv: Make __flush_tlb_range() loop over pte instead of flushing the
> > whole tlb
> The whole series does not build on nommu & this one adds a build warning
> for regular builds:
> + 1 ../arch/riscv/mm/tlbflush.c:32:15: warning: symbol 'tlb_flush_all_threshold' was not declared. Should it be static?
>
> Cheers,
> Conor.


I'll fix the nommu build, sorry about that. Weird I missed this warning,
that's an LLVM build right? That variable will need to be overwritten by the
vendors, so that should not be static (but it will depend on what solution
we implement).

Just make it static until we actually have a vendor implementation of
this stuff please, since we don't know what that will look like yet.
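
For illustration, a minimal sketch of the static variant, assuming the threshold is only read inside tlbflush.c. The default of 64 and the helper exceeds_flush_threshold() are hypothetical placeholders, not taken from the series:

```c
/*
 * Sketch only: a file-local tunable needs no header declaration, so the
 * "was not declared. Should it be static?" warning no longer applies.
 * The value 64 is a placeholder, not the value used in the patch.
 */
static unsigned long tlb_flush_all_threshold = 64;

/*
 * Hypothetical helper: returns nonzero when a range of nr_pages entries
 * is large enough that a full flush would be chosen instead.
 */
static inline int exceeds_flush_threshold(unsigned long nr_pages)
{
	return nr_pages > tlb_flush_all_threshold;
}
```

If a vendor hook shows up later, the static could be dropped and a declaration added to a header at that point.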

It's just a performance issue, right? IIRC the SiFive errata wasn't actually based on how many TLB flushes happen, they're just broken in general so it was a probability thing.

If that's the case I agree we can just start with something arbitrary and then figure out how to set the tunable later. It's probably going to be workload-specific too, so we'll probably end up with both a firmware default and a userspace override (maybe a sys entry or whatever).
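
A rough user-space sketch of how such a tunable might be resolved and applied, assuming a userspace override takes precedence over a firmware-provided value, which in turn takes precedence over a built-in default. Every name here (fw_threshold, user_threshold, plan_flush() and so on) is hypothetical, and the default of 64 is a placeholder. It only models the decision, not the actual flush instructions:

```c
#include <assert.h>

/* Placeholder default, not a value taken from the series. */
#define SKETCH_DEFAULT_THRESHOLD 64UL

/* Hypothetical tunables: 0 means "not set". */
static unsigned long fw_threshold;   /* would come from firmware */
static unsigned long user_threshold; /* would come from userspace */

/* Userspace override wins, then firmware, then the built-in default. */
static unsigned long effective_threshold(void)
{
	if (user_threshold)
		return user_threshold;
	if (fw_threshold)
		return fw_threshold;
	return SKETCH_DEFAULT_THRESHOLD;
}

enum flush_kind { FLUSH_ALL, FLUSH_RANGE };

struct flush_plan {
	enum flush_kind kind;
	unsigned long nr_ops; /* number of flush operations to issue */
};

/*
 * Decide between one full flush and a per-entry loop. The number of
 * entries is size / stride, rounded up; stride would be the page size
 * or the huge page size for hugetlb mappings.
 */
static struct flush_plan plan_flush(unsigned long size, unsigned long stride)
{
	struct flush_plan plan;
	unsigned long nr_entries;

	assert(stride != 0);
	nr_entries = size / stride + (size % stride != 0);

	if (nr_entries > effective_threshold()) {
		plan.kind = FLUSH_ALL;
		plan.nr_ops = 1;
	} else {
		plan.kind = FLUSH_RANGE;
		plan.nr_ops = nr_entries;
	}
	return plan;
}
```

With the placeholder default, a 16-page range with a 4 KiB stride would be flushed entry by entry, while a 65-page range would fall back to a single full flush.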