Re: [PATCH] RISC-V: Implement ASID allocator

From: Anup Patel
Date: Wed Mar 27 2019 - 10:10:57 EST


On Wed, Mar 27, 2019 at 7:08 PM Gary Guo <gary@xxxxxxxxxxx> wrote:
>
> On 27/03/2019 11:42, Anup Patel wrote:
> > On Wed, Mar 27, 2019 at 4:57 PM Gary Guo <gary@xxxxxxxxxxx> wrote:
> >>
> >> Hi Anup,
> >>
> >> This won't work on actual hardware with ASID support. There are more
> >
> > Can you elaborate why?
> >
> > This implementation is based on Linux ARM64 ASID allocator which is
> > tested for large number of CPUs on real HW.
> >
> >> interactions with TLB flushes that need to be considered. You won't see
> >
> > Yep, already considered. Please point me to an unhandled case.
> >
> When a memory mapping is changed, you need to flush it from all cores
> on which that process has previously executed, etc. Our patches both take
> inspiration from ARM's code, but the major difference in my code is the
> handling of cache invalidations; see my code's cache_mask, etc. This is
> actually the most error-prone part, and I spent more time trying to find
> an optimal solution for this than porting the ASID allocator. The major
> difference is that ARM has a much more expressive set of TLB flush
> instructions compared to RISC-V.

We should not require explicit cache maintenance anywhere in RISC-V
because caches are transparent to SW. The HW can use "sfence.vma" as a
hint for any cache maintenance it needs.

Further (just like in the ARM world), page table walks are cache coherent in
RISC-V, so we should not require any cache flushes along with TLB flushes.
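To make that concrete: on HW with ASID support, an ASID-scoped flush is just
an "sfence.vma" with the ASID in rs2 and the address (or x0 for all addresses)
in rs1, with no cache maintenance involved. A minimal sketch (the helper name
below is made up here, not taken from the patch):

static inline void local_flush_tlb_all_asid(unsigned long asid)
{
	/* rs1 = x0 means "all addresses", rs2 carries the ASID. */
	__asm__ __volatile__ ("sfence.vma x0, %0"
			      : /* no outputs */
			      : "r" (asid)
			      : "memory");
}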

Now, if you are seeing inconsistent cache contents, then it might be due to
some bug in your HW. I am just guessing here.

Regarding changing memory mappings, I am sure the generic Linux kernel will
issue the appropriate flush_tlb_xyz() calls.
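For reference, a rough sketch of what such a hook can boil down to on RISC-V
(not the exact code from either patch; the function names below are made up,
how the ASID is looked up from mm->context is left out, and only standard
kernel APIs like on_each_cpu_mask() are used). It also shows where the
"flush on every core that previously ran the process" part you describe is
handled, via mm_cpumask():

#include <linux/mm.h>
#include <linux/smp.h>

/* Executed on each target CPU: flush every local TLB entry for one ASID. */
static void ipi_flush_asid(void *info)
{
	unsigned long asid = (unsigned long)info;

	__asm__ __volatile__ ("sfence.vma x0, %0" : : "r" (asid) : "memory");
}

/* Sketch: flush a mapping change for one mm on every CPU that ran it. */
static void flush_tlb_mm_sketch(struct mm_struct *mm, unsigned long asid)
{
	/* mm_cpumask(mm) tracks the CPUs this mm has been active on. */
	on_each_cpu_mask(mm_cpumask(mm), ipi_flush_asid, (void *)asid, true);
}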

> >
> >> this on both QEMU and the SiFive board, as QEMU does not have ASID, so it
> >> pretends that ASID is supported by just flushing its TLB every time you
> >
> > Nope, it does not. It detects whether ASID is supported or not. If supported,
> > it will also figure out the number of ASID bits supported by the HW.
> >
> Except that you can detect that QEMU supports ASID when actually it does
> not. However, QEMU is still correct because it always flushes the TLB when
> you set SATP/SPTBR. You won't be able to find bugs in your code by just
> testing on QEMU.

I am not advocating that testing on QEMU is sufficient. It's just functionally
correct and works on HW without ASID support.
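By the way, the ASID-width detection I mentioned is roughly the following
(a sketch under a few assumptions, not the literal patch code: the
SATP_ASID_* defines are mine, the ASID field on RV64 satp is 16 bits at
bits 59:44 per the privileged spec, and csr_read()/csr_write() plus a
CSR_SATP define are assumed to be available from asm/csr.h):

#include <linux/bitops.h>
#include <linux/init.h>
#include <asm/csr.h>
#include <asm/tlbflush.h>

/* Assumed RV64 satp layout: ASID field is 16 bits at bits 59:44. */
#define SATP_ASID_SHIFT	44
#define SATP_ASID_MASK	0xffffUL

static unsigned long asid_bits;

/* Sketch: probe, once on the boot CPU, how many ASID bits the HW keeps. */
static void __init asid_probe(void)
{
	unsigned long old, asid;

	old = csr_read(CSR_SATP);
	/* Try to set every ASID bit; being WARL, HW keeps only what it implements. */
	csr_write(CSR_SATP, old | (SATP_ASID_MASK << SATP_ASID_SHIFT));
	asid = (csr_read(CSR_SATP) >> SATP_ASID_SHIFT) & SATP_ASID_MASK;
	csr_write(CSR_SATP, old);

	/* Zero on HW without ASID support (e.g. the SiFive board). */
	asid_bits = asid ? fls_long(asid) : 0;

	/* The window above ran with a different ASID; flush to be safe. */
	local_flush_tlb_all();
}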

>
> > The SiFive board does not have ASID bits, so this implementation successfully
> > detects that the number of ASID bits is 0 and falls back to the original way
> > of local TLB flushes.
> >> change sptbr. I suspect the performance gain you see is just due to the
> >> saved TLB flushes, as a TLB flush is super expensive in QEMU (all translation
> >> block jumps need to be cleared).
> >
> > Yes, the performance gain is due to saved TLB flushes.
> >
> > On HW which supports ASID bits, we will see more performance
> > improvements.
> A hardware TLB flush is cheaper than QEMU's TLB flush. As no hardware
> supports ASID at the moment, the performance gain is minimal.
> >>
> >> I have my version here https://github.com/nbdd0121/linux/tree/asid. I
> >> haven't done code cleanups yet, but this version has correctness of its
> >> ASID code tested on our TLB simulator tool (which unfortunately I can't
> >> share right now as it involves with unpublished works).
> >
> > Except for a few minor differences, your version of the ASID allocator is the
> > same as mine. In fact, there are a lot of similar code fragments in your
> > version compared to Linux ARM64 as well. I am sure this patch will work for you.
> >
> >> In fact I submitted my previous patch series exactly as the basis of this
> >> patch.
> >
> > This patch is based on your patch series, so I suggest you take this patch
> > and try it on your simulator.
> >
> I've tested, and it does not boot.

Thanks for the info. Now help me make this patch better.

Regards,
Anup