Re: [PATCH v5 0/2] MTE support for KVM guest

From: Marc Zyngier
Date: Mon Dec 07 2020 - 11:06:39 EST


On 2020-12-07 15:45, Steven Price wrote:
> On 07/12/2020 15:27, Peter Maydell wrote:
>> On Mon, 7 Dec 2020 at 14:48, Steven Price <steven.price@xxxxxxx> wrote:
>>> Sounds like you are making good progress - thanks for the update. Have
>>> you thought about how the PROT_MTE mappings might work if QEMU itself
>>> were to use MTE? My worry is that we end up with MTE in a guest
>>> preventing QEMU from using MTE itself (because of the PROT_MTE
>>> mappings). I'm hoping QEMU can wrap its use of guest memory in a
>>> sequence which disables tag checking (something similar will be needed
>>> for the "protected VM" use case anyway), but this isn't something I've
>>> looked into.
>>
>> It's not entirely the same as the "protected VM" case. For that
>> the patches currently on list basically special case "this is a
>> debug access (eg from gdbstub/monitor)" which then either gets
>> to go via "decrypt guest RAM for debug" or gets failed depending
>> on whether the VM has a debug-is-ok flag enabled. For an MTE
>> guest the common case will be guests doing standard DMA operations
>> to or from guest memory. The ideal API for that from QEMU's
>> point of view would be "accesses to guest RAM don't do tag
>> checks, even if tag checks are enabled for accesses QEMU does to
>> memory it has allocated itself as a normal userspace program".
>
> Sorry, I know I simplified it rather by saying it's similar to
> protected VM. Basically as I see it there are three types of memory
> access:
>
> 1) Debug case - has to go via a special case for decryption or
> ignoring the MTE tag value. Hopefully this can be abstracted in the
> same way.
>
> 2) Migration - for a protected VM there's likely to be a special
> method to allow the VMM access to the encrypted memory (AFAIK memory
> is usually kept inaccessible to the VMM). For MTE this again has to be
> special cased as we actually want both the data and the tag values.
>
> 3) Device DMA - for a protected VM it's usual to unencrypt a small
> area of memory (with the permission of the guest) and use that as a
> bounce buffer. This is possible with MTE: have an area the VMM
> purposefully maps with PROT_MTE. The issue is that this has a
> performance overhead and we can do better with MTE because it's
> trivial for the VMM to disable the protection for any memory.
>
> The part I'm unsure on is how easy it is for QEMU to deal with (3)
> without the overhead of bounce buffers. Ideally there'd already be a
> wrapper for guest memory accesses and that could just be wrapped with
> setting TCO during the access. I suspect the actual situation is more
> complex though, and I'm hoping Haibo's investigations will help us
> understand this.

What I'd really like to see is a description of how shared memory
is, in general, supposed to work with MTE. My gut feeling is that
it doesn't, and that you need to turn MTE off when sharing memory
(either implicitly or explicitly).

Thanks,

M.
--
Jazz is not dead. It just smells funny...