Re: [PATCH 2/2] PCI: vmd: Fix shadow offsets to reflect spec changes

From: Derrick, Jonathan
Date: Tue Sep 17 2019 - 11:51:43 EST


On Tue, 2019-09-17 at 16:15 +0100, Lorenzo Pieralisi wrote:
> On Tue, Sep 17, 2019 at 02:45:03PM +0000, Derrick, Jonathan wrote:
> > On Tue, 2019-09-17 at 15:05 +0100, Lorenzo Pieralisi wrote:
> > > On Tue, Sep 17, 2019 at 01:55:59PM +0000, Derrick, Jonathan wrote:
> > > > On Tue, 2019-09-17 at 11:41 +0100, Lorenzo Pieralisi wrote:
> > > > > On Mon, Sep 16, 2019 at 07:54:35AM -0600, Jon Derrick wrote:
> > > > > > The shadow offset scratchpad was moved to 0x2000-0x2010. Update the
> > > > > > location to get the correct shadow offset.
> > > > >
> > > > > Hi Jon,
> > > > >
> > > > > what does "was moved" mean ? Would this code still work on previous HW ?
> > > > >
> > > > The previous code won't work on (not yet released) hw. Guests using the
> > > > domain will see the wrong offset and enumerate the domain incorrectly.
> > >
> > > That's true also for new kernels on _current_ hardware, isn't it ?
> > >
> > > What I am saying is that there must be a way to detect the right
> > > offset from HW probing or firmware otherwise things will break
> > > one way or another.
> > >
> > I think this is basically that, but the spec changed which register
> > addresses contained the offset. The offset was still discoverable
> > either way, but is now within 0x2000-0x2010, with 0x2010-0x2090 as oob
> > interface.
> >
> >
> >
> > > > > We must make sure that the address move is managed seamlessly by the
> > > > > kernel.
> > > > If we need to avoid changing addressing within stable, then that's
> > > > fine. But otherwise is there an issue merging with 5.4?
> > >
> > > See above. Would 5.4 with this patch applied work on systems where
> > > the offset is the same as the _current_ one without this patch
> > > applied ?
> > I understand your concern, but these systems with wrong addressing
> > won't exist because the hardware isn't released yet.
> >
> > In the future, the hardware will be released and users will inevitably
> > load some unfixed kernel, and we would like to point to stable for the
> > fix.
>
> I am sorry for being blunt but I need to understand. If we apply
> this patch, are you telling me that the _current_ HW would fail ?
>
> I assume the current HW+kernel set-up is working, maybe that's
> what I got wrong.
>
> Reworded: on existing HW, is this patch fixing anything ?
>
> This patch when it hits the mainline will trickle into stable
> kernel unchanged.
Sorry for the confusion.

These changes only affect systems with VMD devices with 8086:28C0
device IDs, but these won't be production hardware for some time.

Systems with VMD devices exist in the wild with 8086:201D device IDs.
These don't support the guest passthrough mode and this code won't
break anything with them. Additionally, patch 1/2 (bus numbering) only
affects 8086:28C0.

So on existing HW, these patches won't affect anything.
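
To illustrate the split by device ID, here is a minimal userspace sketch.
It is not the driver code: the struct, the feature flag name and both
helper functions are made up for illustration. Only the two device IDs
come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bit, named here for illustration only. */
#define FEAT_SHADOW_OFFSET 0x1u

struct vmd_id {
	uint16_t vendor;
	uint16_t device;
	unsigned int features;
};

/* Only 8086:28C0 opts in to the shadow offset handling. */
static const struct vmd_id vmd_ids[] = {
	{ 0x8086, 0x201d, 0 },
	{ 0x8086, 0x28c0, FEAT_SHADOW_OFFSET },
};

/* Return the table entry for a vendor/device pair, or NULL if unknown. */
static const struct vmd_id *vmd_lookup(uint16_t vendor, uint16_t device)
{
	for (size_t i = 0; i < sizeof(vmd_ids) / sizeof(vmd_ids[0]); i++) {
		if (vmd_ids[i].vendor == vendor && vmd_ids[i].device == device)
			return &vmd_ids[i];
	}
	return NULL;
}

/* True only for devices whose entry carries the shadow offset feature. */
static bool vmd_reads_shadow_offset(const struct vmd_id *id)
{
	assert(id != NULL);
	return (id->features & FEAT_SHADOW_OFFSET) != 0;
}
```

With a table like this, the 0x2000 scratchpad is never touched on
8086:201D, so changing its location cannot alter behaviour there.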



>
> > > > > For the time being I have to drop this fix and it is extremely
> > > > > tight to get it in v5.4, I can't send stable changes that we may
> > > > > have to revert.
> > > > Aren't we in the beginning of the merge window?
> > >
> > > Yes and that's the problem, patches for v5.4 should have already
> > > been queued a while ago, I do not know when Bjorn will send the
> > > PR for v5.4 but that's going to happen shortly, I am making an
> > > exception to squeeze these patches in since it is vmd only code
> > > but still it is very very tight.
> > >
> > If you feel there's a risk, then I think it can be staged for v5.5.
> > Hardware will not be available for some time.
>
> I do not feel it is risky, I feel it would be much better if the
> scratchpad address could be detected at runtime through versioning
> of sorts either HW or firmware based.
>
> If we can't probe it inevitably we will have systems where kernels
> would break and that's something we should avoid.
>
I agree that it might have been nicer if it were an ACPI/EFI var, but I
think there were some complexities with teaching hypervisors to expose
it to the guests for use when enumerating the domain from the passed-
through endpoint. The method that exists in 8086:28C0 hardware divorces
the firmware descriptors from the device so that the guest driver only
needs to read the host-to-guest physical offset from the device itself.
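
As a rough sketch of what "read the offset from the device itself" could
look like, the snippet below works on a plain byte buffer standing in
for the mapped BAR. Only the 0x2000 base comes from the patch
description. The layout (two little-endian 64-bit host addresses in
0x2000-0x2010), the subtraction and all names are assumptions for
illustration, not the actual driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* From the patch description: the scratchpad now starts at 0x2000. */
#define SHADOW_BASE 0x2000u

/* Read a little-endian 64-bit value from a byte buffer. */
static uint64_t read_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Assumption: slot idx (0 or 1) holds the host physical start of a
 * window, and the offset the guest needs is its own view of that
 * window's start minus the host value.
 */
static uint64_t shadow_delta(const uint8_t *membar, unsigned int idx,
			     uint64_t guest_start)
{
	assert(membar != NULL && idx < 2);
	return guest_start - read_le64(membar + SHADOW_BASE + 8u * idx);
}
```

The point is that the guest needs nothing from ACPI/EFI here, only the
device's own registers and the BAR addresses it was assigned.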


Best regards,
Jon


> Lorenzo