Re: [PATCH 1/2] iommu: Prevent RESV_DIRECT devices from blocking domains
From: Jason Gunthorpe
Date: Tue Jun 27 2023 - 11:49:46 EST
On Tue, Jun 27, 2023 at 08:10:48AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > Sent: Monday, June 19, 2023 11:30 PM
> >
> > On Mon, Jun 19, 2023 at 03:20:30PM +0100, Robin Murphy wrote:
> >
> > > so if the only time they're used is when the IOMMU driver has
> > > already had a catastrophic internal failure such that we decide to
> > > declare the device toasted and deliberately put it into an unusable
> > > state, blocking its reserved regions doesn't seem like a big deal.
> >
> > I think we should discuss then when we get to actually implementing
> > the error recovery flow we want. I do like blocking in general for the
> > reasons you give, and that was my first plan.. But if the BIOS will
> > crash or something if we don't do the reserved region maps that isn't
> > so good either. IDK would like to hear from the people using this BIOS
> > feature.
> >
>
> The only devices with RMRR which I'm aware of on Intel platforms are
> GPU and USB. However they are all RESV_DIRECT_RELAXABLE type.
>
> Here is one reference from the Xen hypervisor. It has a concept called
> quarantine domain (similar to blocking domain) when a device is
> de-assigned w/o an owner. The quarantine domain has no mappings
> except the ones identity-mapped for RMRR types. I'm not sure whether
> they observed real examples of RMRR devices which are not GPU/USB.
I guess, it seems a bit hard core, but we could spend the cost to
build such a domain during probe..
At least for our cases we already do expect that DMA is halted before
we start messing with the domains so identity may not be the same
issue as with Xen..
Jason