Re: [PATCH v1 2/2] KVM: arm64: allow the VM to select DEVICE_* and NORMAL_NC for IO memory

From: Will Deacon
Date: Fri Oct 13 2023 - 05:30:08 EST


On Thu, Oct 12, 2023 at 03:36:24PM -0300, Jason Gunthorpe wrote:
> On Thu, Oct 12, 2023 at 05:39:31PM +0100, Will Deacon wrote:
>
> > All I'm asking for is justification as to why Normal-NC is the right
> > memory type rather than any other normal memory type. If it's not possible
> > to explain that architecturally, then I'm not sure this change belongs in
> > architecture code.
>
> Well, I think Catalin summarized it nicely, I second his ask at the end:
>
> We are basically at your scenario below - can you justify why
> DEVICE_GRE is correct today, architecturally? We could not. Earlier
> someone said uncontained failure prevention, but a deep dive on that
> found it is not so.

This logic can be used to justify the use of any other memory type. I'm
asking specifically why Normal-NC is correct.

> > Ultimately, we need to be able to maintain this stuff, so we can't just
> > blindly implement changes based on a combination of off-list discussions
> > and individual product needs. For example, if somebody else rocks up
> > tomorrow and asks for this to be Normal-writethrough, what grounds do
> > we have to say no if we've taken this change already?
>
> Hmm. Something got lost here.

Well, I didn't assume FWB since these patches change the behaviour
regardless...

> This patch is talking about the S2 MemAttr[3:0]. There are only 4
> relevant values (when FEAT_S2FWB) (see D5.5.5 in ARM DDI 0487F.c)
>
> 0b001 - Today: force VM to be Device nGnRE
>
> 0b101 - Proposed: prevent the VM from selecting cachable, allow it to
> choose Device-* or NormalNC
>
> 0b110 - Force write back. Would totally break MMIO, ruled out
>
> 0b111 - Allow the VM to select anything, including cachable.
> This is nice, but summarizing Catalin's remarks:
> a) The kernel can't do cache maintenance (defeats FWB)
> b) Catalin's concerns about MTE and Atomics triggering
> uncontained failures
> c) It is unclear about uncontained failures for cachable
> in general
>
> So the only argument is 001 v 110 v 111

Ok, so I can see why you end up with Normal-NC on a system that implements
FWB. Systems without FWB have other options, but you can reasonably argue
that having consistent behaviour between the two configurations is
desirable. Please can you add that to the commit message?

I still don't understand how to reclaim an MMIO region that the guest has
mapped Normal-NC (see the thread with Catalin).

Will