Re: [PATCH v2 0/2] MTE support for KVM guest

From: Steven Price
Date: Thu Sep 10 2020 - 06:25:37 EST


On 10/09/2020 01:33, Richard Henderson wrote:
> On 9/4/20 9:00 AM, Steven Price wrote:
>> 3. Doesn't provide any new methods for the VMM to access the tags on
>> memory.
>> ...
>> (3) may be problematic and I'd welcome input from those familiar with
>> VMMs. User space cannot access tags unless the memory is mapped with the
>> PROT_MTE flag. However enabling PROT_MTE will also enable tag checking
>> for the user space process (assuming the VMM enables tag checking for
>> the process)...

> The latest version of the kernel patches for user mte support has separate
> controls for how tag check faults are reported, including:
>
> +- ``PR_MTE_TCF_NONE`` - *Ignore* tag check faults
>
> That may be less than optimal once userland starts using tags itself, e.g.
> running qemu itself with an mte-aware malloc.
>
> Independent of that, there's also the TCO bit, which can be toggled by any
> piece of code that wants to disable checking locally.

Yes, I would expect the TCO bit is the best option for wrapping accesses
to make them unchecked.
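
A minimal, untested sketch of such a wrapper follows. The helper names
(tag_checks_suppress, tag_checks_restore, copy_unchecked) are made up for
illustration and are not part of any existing API. The MSR is only emitted
when the compiler targets AArch64 with MTE enabled (for example with
-march=armv8.5-a+memtag); on any other target the wrappers compile to
no-ops and the copy is an ordinary memcpy.

```c
#include <stddef.h>
#include <string.h>

/* Set PSTATE.TCO so that loads and stores are not tag checked. */
void tag_checks_suppress(void)
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_MEMORY_TAGGING)
    __asm__ volatile("msr tco, #1" : : : "memory");
#endif
}

/* Clear PSTATE.TCO again. This assumes TCO was clear on entry;
 * a nested user would need to save and restore the old value. */
void tag_checks_restore(void)
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_MEMORY_TAGGING)
    __asm__ volatile("msr tco, #0" : : : "memory");
#endif
}

/* Hypothetical helper: copy n bytes with tag checking suppressed
 * for the duration of the copy only. */
void *copy_unchecked(void *dst, const void *src, size_t n)
{
    tag_checks_suppress();
    memcpy(dst, src, n);
    tag_checks_restore();
    return dst;
}
```

The point of the bracket is that the rest of the VMM can keep tag
checking enabled for its own allocations while only the guest memory
accesses go unchecked.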

> However, none of that is required for accessing tags. User space can always
> load/store tags via LDG/STG. That's going to be slow, though.

Yes, as things stand, LDG/STG is the way for user space to access tags.
Since I don't have any real hardware I can't really comment on speed.
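
For illustration, reading the tags of a buffer one granule at a time
could look roughly like the sketch below. It is untested on hardware.
The function names (tag_of, load_tag, read_tags) are hypothetical, the
buffer is assumed to be mapped with PROT_MTE, and the LDG is only
emitted when the compiler targets AArch64 with MTE enabled; on other
targets load_tag simply reports tag 0.

```c
#include <stddef.h>
#include <stdint.h>

#define MTE_GRANULE_SIZE 16u  /* bytes covered by one allocation tag */

/* Extract the 4-bit tag held in bits 59:56 of a 64-bit address value. */
uint8_t tag_of(uint64_t addr)
{
    return (uint8_t)((addr >> 56) & 0xf);
}

/* Load the allocation tag of the granule that contains p. */
uint8_t load_tag(const void *p)
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_MEMORY_TAGGING)
    uint64_t x = (uint64_t)(uintptr_t)p;
    /* LDG writes the allocation tag into bits 59:56 of the register. */
    __asm__ volatile("ldg %0, [%0]" : "+r"(x) : : "memory");
    return tag_of(x);
#else
    (void)p;
    return 0;  /* no MTE on this target: there is no tag to read */
#endif
}

/* Hypothetical helper: store one tag per 16-byte granule of buf into
 * tags[] and return the number of granules visited. */
size_t read_tags(const void *buf, size_t len, uint8_t *tags)
{
    const unsigned char *p = buf;
    size_t n = len / MTE_GRANULE_SIZE;

    for (size_t i = 0; i < n; i++)
        tags[i] = load_tag(p + i * MTE_GRANULE_SIZE);
    return n;
}
```

That is one instruction per 16 bytes, so a 4 KiB page takes 256 LDGs,
which is where the concern about speed comes from.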

> It's a shame that LDGM/STGM are privileged instructions. I don't understand
> why that was done, since there's absolutely nothing that those insns can do
> that you can't do with (up to) 16x LDG/STG.

It is a shame; however, I suspect this is because using those
instructions requires knowing the block size held in GMID_EL1, and at
least in theory that could vary between CPUs.
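
To make the dependency concrete: GMID_EL1.BS (bits [3:0]) gives log2 of
the block size in 4-byte words, with architectural values from 2 (16
bytes) to 6 (256 bytes). A small sketch of the arithmetic, with
hypothetical function names and the register value passed in as a plain
integer because GMID_EL1 itself is not readable from EL0:

```c
#include <stdint.h>

#define MTE_GRANULE_SIZE 16u  /* bytes covered by one allocation tag */

/* Bytes transferred by one LDGM/STGM, derived from GMID_EL1.BS. */
unsigned gm_block_bytes(uint64_t gmid_el1)
{
    unsigned bs = (unsigned)(gmid_el1 & 0xf);  /* BS field, bits [3:0] */

    return 4u << bs;  /* BS is log2 of the size in 4-byte words */
}

/* Number of tags (one per granule) covered by one LDGM/STGM. */
unsigned gm_tags_per_block(uint64_t gmid_el1)
{
    return gm_block_bytes(gmid_el1) / MTE_GRANULE_SIZE;
}
```

With BS == 6 this gives 256 bytes and 16 tags per instruction, which
matches the "(up to) 16x LDG/STG" figure above; with BS == 2 a single
LDGM covers only one granule.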

> I think it might be worth adding some sort of kernel entry point that can bulk
> copy tags, e.g. page aligned quantities. But that's just a speed of migration
> thing and could come later.

When we have some real hardware it would be worth profiling this. At the
moment I've no idea whether the kernel entry overhead would make such an
interface useful from a performance perspective or not.

Steve