Re: [PATCH v5 0/2] MTE support for KVM guest

From: Marc Zyngier
Date: Tue Dec 08 2020 - 13:22:21 EST


On 2020-12-08 17:21, Catalin Marinas wrote:
On Mon, Dec 07, 2020 at 07:03:13PM +0000, Marc Zyngier wrote:
On Mon, 07 Dec 2020 16:34:05 +0000,
Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
> On Mon, Dec 07, 2020 at 04:05:55PM +0000, Marc Zyngier wrote:
> > What I'd really like to see is a description of how shared memory
> > is, in general, supposed to work with MTE. My gut feeling is that
> > it doesn't, and that you need to turn MTE off when sharing memory
> > (either implicitly or explicitly).
>
> The allocation tag (in-memory tag) is a property assigned to a physical
> address range and it can be safely shared between different processes as
> long as they access it via pointers with the same allocation tag (bits
> 59:56). The kernel enables such tagged shared memory for user processes
> (anonymous, tmpfs, shmem).

I think that's one case where the shared memory scheme breaks, as we
have two kernels in charge of their own tags, and they obviously can't
be synchronised

Yes, if you can't trust the other entity to not change the tags, the
only option is to do an untagged access.

> What we don't have in the architecture is a memory type which allows
> access to tags but no tag checking. To access the data when the tags
> aren't known, the tag checking would have to be disabled via either a
> prctl() or by setting the PSTATE.TCO bit.

I guess that's point (3) in Steven's taxonomy. It still a bit ugly to
fit in an existing piece of userspace, specially if it wants to use
MTE for its own benefit.

I agree it's ugly. For the device DMA emulation case, the only sane way
is to mimic what a real device does - no tag checking. For a generic
implementation, this means that such shared memory should not be mapped
with PROT_MTE on the VMM side. I guess this leads to your point that
sharing doesn't work for this scenario ;).

Exactly ;-)

> The kernel accesses the user memory via the linear map using a match-all
> tag 0xf, so no TCO bit toggling. For user, however, we disabled such
> match-all tag and it cannot be enabled at run-time (at least not easily,
> it's cached in the TLB). However, we already have two modes to disable
> tag checking which Qemu could use when migrating data+tags.

I wonder whether we will have to have something kernel side to
dump/reload tags in a way that matches the patterns used by live
migration.

We have something related - ptrace dumps/resores the tags. Can the same
concept be expanded to a KVM ioctl?

Yes, although I wonder whether we should integrate this deeply into
the dirty-log mechanism: it would be really interesting to dump the
tags at the point where the page is flagged as clean from a dirty-log
point of view. As the page is dirtied, discard the saved tags.

It is probably expensive, but it ensures that the VMM sees consistent
tags (if the page is clean, the tags are valid). Of course, it comes
with the added requirement that the VMM allocates enough memory to
store the tags, which may be a tall order. I'm not sure how to
give a consistent view to userspace otherwise.

It'd be worth looking at how much we can reuse from the ptrace (and
I expect swap?) code to implement this.

Thanks,

M.
--
Jazz is not dead. It just smells funny...