Re: [PATCH V4 05/18] iommu/ioasid: Redefine IOASID set and allocation APIs

From: Jason Gunthorpe
Date: Wed Apr 28 2021 - 20:21:58 EST


On Wed, Apr 28, 2021 at 11:23:39AM +1000, David Gibson wrote:

> Yes. My proposed model for a unified interface would be that when you
> create a new container/IOASID, *no* IOVAs are valid.

Hurm, it is quite tricky. All IOMMUs seem to have a dead zone around
the MSI window, so negotiating this all in a general way is not going
to be a very simple API.

To be general it would be nicer to say something like 'I need XXGB of
IOVA space' 'I need 32 bit IOVA space' etc and have the kernel return
ranges that sum up to at least that much. Then the kernel can do
all its optimizations.
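
As a rough sketch of that request shape, written as plain user-space C
for illustration only: every name below (struct iova_range,
iova_request) is hypothetical and is not an existing kernel interface.
It assumes the list of usable ranges is sorted and already has dead
zones like the MSI window carved out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type, not part of any existing uAPI. */
struct iova_range {
	uint64_t start;	/* first usable address, inclusive */
	uint64_t last;	/* last usable address, inclusive */
};

/*
 * Hand out whole ranges from 'avail' (assumed sorted, non-overlapping,
 * reserved regions already removed) until they cover at least 'want'
 * bytes. Ranges are clipped to 'max_addr', so "I need 32 bit IOVA
 * space" is expressed as max_addr = UINT32_MAX.
 *
 * Returns the number of ranges written to 'out', or 0 if the request
 * cannot be met (not enough space, or 'out' is too small).
 */
static size_t iova_request(const struct iova_range *avail, size_t n_avail,
			   uint64_t want, uint64_t max_addr,
			   struct iova_range *out, size_t out_cap)
{
	uint64_t got = 0;
	size_t n = 0;

	assert(avail && out);
	if (want == 0)
		return 0;

	for (size_t i = 0; i < n_avail && got < want; i++) {
		struct iova_range r = avail[i];
		uint64_t span, remaining;

		if (r.start > max_addr)
			break;			/* sorted: nothing further fits */
		if (r.last > max_addr)
			r.last = max_addr;	/* clip to the addressing limit */
		if (n == out_cap)
			return 0;
		out[n++] = r;

		/* span is size - 1, which avoids overflow on a full 64-bit range */
		span = r.last - r.start;
		remaining = want - got;
		if (span >= remaining - 1)
			got = want;
		else
			got += span + 1;
	}
	return got >= want ? n : 0;
}
```

The caller only states a size and an addressing limit, and the choice
of which ranges to hand back stays entirely on the kernel side.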

I guess you are going to say that the qemu PPC vIOMMU driver needs
more exact control..

> I expect we'd need some kind of query operation to expose limitations
> on the number of windows, addresses for them, available pagesizes etc.

Is page size an assumption that hugetlbfs will always be used for backing
memory or something?

> > As an ideal, only things like the HW specific qemu vIOMMU driver
> > should be reaching for all the special stuff.
>
> I'm hoping we can even avoid that, usually. With the explicitly
> created windows model I propose above, it should be able to: qemu will
> create the windows according to the IOVA windows the guest platform
> expects to see and they either will or won't work on the host platform
> IOMMU. If they do, generic maps/unmaps should be sufficient. If they
> don't well, the host IOMMU simply cannot emulate the vIOMMU so you're
> out of luck anyway.

It is not just P9 that has special stuff, and this whole area of PASID
seems to be quite different on every platform.

If things fit very naturally and generally then maybe, but I've been
down this road before of trying to make a general description of a
group of very special HW. It ended in tears after 10 years when nobody
could understand the "general" API after it was Frankenstein'd up with
special cases for everything. A cautionary tale.

There is a certain appeal to having some
'PPC_TCE_CREATE_SPECIAL_IOASID' entry point that carries a whack of
extra information, like windows, and that the viommu driver can
optionally call, and it remains well defined and described.

Jason