Re: [PATCH v5 10/11] powerpc/pseries/iommu: Make use of DDW for indirect mapping

From: Leonardo Brás
Date: Tue Aug 17 2021 - 02:00:05 EST


Alexey, Fred:

On Fri, 2021-07-23 at 15:34 +1000, Alexey Kardashevskiy wrote:
>
>
> On 22/07/2021 01:04, Frederic Barrat wrote:
> >
> >
> > On 21/07/2021 05:32, Alexey Kardashevskiy wrote:
> > > > > +        struct iommu_table *newtbl;
> > > > > +        int i;
> > > > > +
> > > > > +        for (i = 0; i < ARRAY_SIZE(pci->phb->mem_resources);
> > > > > i++) {
> > > > > +            const unsigned long mask = IORESOURCE_MEM_64 |
> > > > > IORESOURCE_MEM;
> > > > > +
> > > > > +            /* Look for MMIO32 */
> > > > > +            if ((pci->phb->mem_resources[i].flags & mask) ==
> > > > > IORESOURCE_MEM)
> > > > > +                break;
> > > > > +        }
> > > > > +
> > > > > +        if (i == ARRAY_SIZE(pci->phb->mem_resources))
> > > > > +            goto out_del_list;
> > > >
> > > >
> > > > So we exit and do nothing if there's no MMIO32 bar?
> > > > Isn't the intent just to figure out the MMIO32 area to reserve
> > > > it
> > > > when init'ing the table? In which case we could default to 0,0
> > > >
> > > > I'm actually not clear why we are reserving this area on
> > > > pseries.
> > >
> > >
> > >
> > > If we do not reserve it, then the iommu code will allocate DMA
> > > pages
> > > from there and these addresses are MMIO32 from the kernel pov at
> > > least. I saw crashes when (I think) a device tried DMAing to the
> > > top
> > > 2GB of the bus space which happened to be a some other device's
> > > BAR.
> >
> >
> > hmmm... then figuring out the correct range needs more work. We
> > could
> > have more than one MMIO32 bar. And they don't have to be adjacent.
>
> They all have to be within the MMIO32 window of a PHB and we reserve
> the
> entire window here.
>
> > I
> > don't see that we are reserving any range on the initial table
> > though
> > (on pseries).
> True, we did not need to, as the hypervisor always took care of DMA
> and
> MMIO32 regions to not overlap.
>
> And in this series we do not (strictly speaking) need this either as
> phyp never allocates more than one window dynamically and that only
> window is always the second one starting from 0x800.0000.0000.0000.
> It
> is probably my mistake that KVM allows a new window to start from 0 -
> PAPR did not prohibit this explicitly.
>
> And for the KVM case, we do not need to remove the default window as
> KVM
> can pretty much always allocate as many TCE as the VM wants. But we
> still allow removing the default window and creating a huge one
> instead
> at 0x0 as this way we can allow 1:1 for every single PCI device even
> if
> it only allows 48 (or similar but less than 64bit) DMA. Hope this
> makes
> sense. Thanks,
>
>

Thank you for this discussion, I got to learn a lot!

If I got this, no further change will be necessary, is that correct?

I am testing a v6, and I intend to send it soon.