Re: [PATCH] RISC-V: Implement ASID allocator

From: Gary Guo
Date: Wed Mar 27 2019 - 09:39:10 EST

On 27/03/2019 11:42, Anup Patel wrote:
> On Wed, Mar 27, 2019 at 4:57 PM Gary Guo <gary@xxxxxxxxxxx> wrote:
>> Hi Anup,
>> This won't work in an actual hardware with ASID support. There're more
> Can you elaborate why >
> This implementation is based on Linux ARM64 ASID allocator which is
> tested for large number of CPUs on real HW.
>> interactions with TLB flushes that need to be considered. You won't see
> Yap, already considered. Please point me to unhandled case.
When memory mapping is changed, you need to flush it from all cores that
previously have that process executed, etc. Our patches both take
inspiration from ARM's code, but the major difference between my code is
handling of cache invalidations, see my code's cache_mask, etc. This is
actually the most error-prone part, and I spent more time trying to find
an optimal solution for this than porting the ASID allocator. The major
difference is that ARM has a much more expressive sets of TLB flush
instructions compared to RISC-V.
>> this on both QEMU and SiFive board, as QEMU does not have ASID, so it
>> pretends that ASID is supported by just flushing its TLB everytime you
> Nope, it does not. It detects whether ASID is supported or not. If supported
> it will also figure-out number of ASID bits supported by HW.
Except that you can detect that QEMU supports ASID, but actually it does
not. However QEMU is still correct because it always flush TLB when you
set SATP/SPTBR. You won't be able to find out bugs in your code by just
testing on QEMU.

> SiFive board does not have ASID bits so this implementation successfully
> detects that number of ASID bits are 0 and fallbacks to original way of
> local TLB flushes. >
>> change sptbr. I suspect the performance gain you see is just due to
>> saved TLB flush as TLB flush is super expensive in QEMU (all translation
>> block jumps need to be cleared).
> Yes, performance gain is due to saved TLB flushes.
> On HW which supports ASID bits, we will see more performance
> improvements. >
A hardware TLB flush is cheaper than QEMU' TLB flush. As no hardware
supports ASID at the moment the performance gain is minimal.
>> I have my version here I
>> haven't done code cleanups yet, but this version has correctness of its
>> ASID code tested on our TLB simulator tool (which unfortunately I can't
>> share right now as it involves with unpublished works).
> Except few minor differences. You version of ASID allocator is same as
> mine. In fact there are lot of similar code framgements in your version
> compared to Linux ARM64 as well. I am sure this patch will work for you.
> >>
>> In fact my submit my previous patch series exactly as the basis of this
>> patch.
> This patch is based your patch series so I suggest you take this patch
> and try it on your simulator.
I've tested, and it does not boot.