Re: [REGRESSION] Too-low frequency limit for AMD GPU PCI-passed-through to Windows VM

From: Alex Williamson
Date: Fri Mar 18 2022 - 11:26:03 EST


On Fri, 18 Mar 2022 11:06:00 -0400
Alex Deucher <alexdeucher@xxxxxxxxx> wrote:

> On Fri, Mar 18, 2022 at 10:46 AM Alex Williamson
> <alex.williamson@xxxxxxxxxx> wrote:
> >
> > On Fri, 18 Mar 2022 08:01:31 +0100
> > Thorsten Leemhuis <regressions@xxxxxxxxxxxxx> wrote:
> >
> > > On 18.03.22 06:43, Paul Menzel wrote:
> > > >
> > > > Am 17.03.22 um 13:54 schrieb Thorsten Leemhuis:
> > > >> On 13.03.22 19:33, James Turner wrote:
> > > >>>
> > > >>>> My understanding at this point is that the root problem is probably
> > > >>>> not in the Linux kernel but rather something else (e.g. the machine
> > > >>>> firmware or AMD Windows driver) and that the change in f9b7f3703ff9
> > > >>>> ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)") simply
> > > >>>> exposed the underlying problem.
> > > >>
> > > >> FWIW: that in the end is irrelevant when it comes to the Linux kernel's
> > > >> 'no regressions' rule. For details see:
> > > >>
> > > >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/admin-guide/reporting-regressions.rst
> > > >>
> > > >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/process/handling-regressions.rst
> > > >>
> > > >>
> > > >> That being said: sometimes for the greater good it's better to not
> > > >> insist on that. And I guess that might be the case here.
> > > >
> > > > But who decides that?
> > >
> > > In the end afaics: Linus. But he can't watch each and every discussion,
> > > so it partly falls down to people discussing a regression, as they can
> > > always decide to get him involved in case they are unhappy with how a
> > > regression is handled. That obviously includes me in this case. I simply
> > > use my best judgement in such situations. I'm still undecided if that
> > > path is appropriate here, that's why I wrote above to see what James
> > > would say, as he afaics was the only one that reported this regression.
> > >
> > > > Running stuff in a virtual machine is not that uncommon.
> > >
> > > No, it's about passing through a GPU to a VM, which is a lot less common
> > > -- and afaics an area where blacklisting GPUs on the host to pass them
> > > through is not uncommon (a quick internet search confirmed that, but I
> > > might be wrong there).
> >
> > Right, interference from host drivers and pre-boot environments is
> > always a concern with GPU assignment in particular. AMD GPUs have a
> > long history of poor behavior relative to things like PCI secondary bus
> > resets which we use to try to get devices to clean, reusable states for
> > assignment. Here a device is being bound to a host driver that
> > initiates some sort of power control, unbound from that driver and
> > exposed to new drivers far beyond the scope of the kernel's regression
> > policy. Perhaps it's possible to undo such power control when
> > unbinding the device, but it's not necessarily a given that such a
> > thing is possible for this device without a cold reset.
> >
> > IMO, it's not fair to restrict the kernel from such advancements. If
> > the use case is within a VM, don't bind host drivers. It's difficult
> > to make promises when dynamically switching between host and userspace
> > drivers for devices that don't have functional reset mechanisms.
> > Thanks,
>
> Additionally, operating the isolated device in a VM on a constrained
> environment like a laptop may have other adverse side effects. The
> driver in the guest would ideally know that this is a laptop and needs
> to properly interact with ACPI to handle power management on the
> device. If that is not the case, the driver in the guest may end up
> running the device out of spec with what the platform supports. It's
> also likely to break suspend and resume, especially on systems which
> use S0ix since the firmware will generally only turn off certain power
> rails if all of the devices on the rails have been put into the proper
> state. That state may vary depending on the platform requirements.

Good point. Devices with platform dependencies to manage thermal
budgets, etc. should be considered "use at your own risk" relative to
device assignment currently. Thanks,

Alex