Re: [PATCH v1 2/2] KVM: arm64: allow the VM to select DEVICE_* and NORMAL_NC for IO memory

From: Jason Gunthorpe
Date: Thu Oct 12 2023 - 11:44:50 EST


On Thu, Oct 12, 2023 at 03:48:08PM +0100, Will Deacon wrote:

> I guess my wider point is that I'm not convinced that non-cacheable is
> actually much better and I think we're going way off the deep end looking
> at what particular implementations do and trying to justify to ourselves
> that non-cacheable is safe, even though it's still a normal memory type
> at the end of the day.

When we went over this with ARM it became fairly clear there wasn't an
official statement that Device-* is safe from uncontained
failures. For instance, looking at the actual IP, our architects
pointed out that ARM IP already provides ways for Device-* to trigger
uncontained failures today.

We then mutually concluded that KVM-safe implementations must already
be preventing uncontained failures for Device-* at the system level
and that the same prevention will carry over to NormalNC as well.

IMHO, this seems to be a gap where ARM has not fully defined when
uncontained failures are allowed and has instead left that as an
implementation choice.

In other words, KVM safety around uncontained failure is not a
property that can be reasoned about from the ARM architecture alone.

> The current wording talks about use-cases (I get this) and error containment
> (it's a property of the system) but doesn't talk at all about why Normal-NC
> is the right result.

Given that Device-* and NormalNC are equally implementation defined
with regard to uncontained failures, NormalNC is the better choice
because it allows more VM functionality.

Further, we have broad agreement that this use case is important,
and that NormalNC is the correct way to address it.

I think you are right to ask for more formality from the ARM team, but
we also shouldn't hold up fixing real functional bugs in real shipping
ARM server products.

Jason