Re: [PATCH v4 2/2] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs

From: Alex Williamson
Date: Wed May 05 2021 - 15:14:07 EST


On Wed, 5 May 2021 23:10:32 +0530
Amey Narkhede <ameynarkhede03@xxxxxxxxx> wrote:

> On 21/05/05 01:56PM, Oliver O'Halloran wrote:
> > On Wed, May 5, 2021 at 12:50 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > >
> > > On Mon, May 03, 2021 at 09:07:11PM -0500, Shanker R Donthineni wrote:
> > > > On 5/3/21 5:42 PM, Bjorn Helgaas wrote:
> > > > > Obviously _RST only works for built-in devices, since there's no AML
> > > > > for plug-in devices, right? So if there's a plug-in card with this
> > > > > GPU, neither SBR nor _RST will work?
> > > > These are not plug-in PCIe GPU cards, will exist on upcoming server
> > > > baseboards. ACPI-reset should work for plug-in devices as well as long
> > > > as firmware has _RST method defined in ACPI-device associated with
> > > > the PCIe hot-plug slot.
> > >
> > > Maybe I'm missing something, but I don't see how _RST can work for
> > > plug-in devices. _RST is part of the system firmware, and that
> > > firmware knows nothing about what will be plugged into the slot. So
> > > if system firmware supplies _RST that knows how to reset the Nvidia
> > > GPU, it's not going to do the right thing if you plug in an NVMe
> > > device instead.
> > >
> > > Can you elaborate on how _RST would work for plug-in devices?

I'm not sure I really understand these concerns about plug-in devices.
In this case I believe we're dealing with an embedded GPU; there is no
case where one of these GPUs would be a discrete device on a plug-in
card. I'm also assuming all SoCs integrating this GPU will provide a
_RST method, but we're also disabling SBR in this series to avoid the
only other generic reset option we'd have for this device.

In the more general case, I'd expect that system firmware isn't going
to implement an _RST method for a pluggable slot, so we'll look up the
ACPI handle, fail to find a _RST method and drop to the next option.
For a PCI/e slot, at best the _RST method might be included in the _PRR
scope rather than the device scope to indicate it affects the entire
slot. That could be something like the #PERST below or a warm reset. I
don't think we're enabling that here, are we?

Otherwise system firmware would need to dynamically provide a _RST
method if it recognized and had support for the plug-in card.
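
To make the fallback described above concrete, here is a rough userspace
sketch, not the code from this series. Every name in it (model_dev,
model_reset, and so on) is hypothetical, and the ordering of the methods
is an assumption for illustration only. Each method returns -ENOTTY when
it doesn't apply, and the caller drops to the next option:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device; these are not kernel structures. */
struct model_dev {
	bool has_acpi_rst;	/* firmware provides _RST in the device scope */
	bool has_flr;		/* device advertises Function Level Reset */
	bool no_bus_reset;	/* NO_BUS_RESET quirk: SBR must not be used */
};

typedef int (*model_reset_fn)(const struct model_dev *dev);

/* Each method returns 0 if it performed the reset, -ENOTTY if unsupported. */
static int model_acpi_reset(const struct model_dev *dev)
{
	return dev->has_acpi_rst ? 0 : -ENOTTY;
}

static int model_flr(const struct model_dev *dev)
{
	return dev->has_flr ? 0 : -ENOTTY;
}

static int model_bus_reset(const struct model_dev *dev)
{
	return dev->no_bus_reset ? -ENOTTY : 0;
}

/* Assumed ordering, for illustration only. */
static const model_reset_fn model_methods[] = {
	model_acpi_reset,
	model_flr,
	model_bus_reset,
};

/*
 * Try each method in turn. Returns the index of the method that handled
 * the reset, a negative errno from a method that failed, or -ENOTTY if
 * no method applies to this device.
 */
static int model_reset(const struct model_dev *dev)
{
	size_t i;

	for (i = 0; i < sizeof(model_methods) / sizeof(model_methods[0]); i++) {
		int rc = model_methods[i](dev);

		if (rc != -ENOTTY)
			return rc ? rc : (int)i;
	}
	return -ENOTTY;
}
```

In this model, an embedded GPU with _RST and the quirk set is handled by
the first method, a device in a pluggable slot without _RST falls through
to FLR or bus reset, and a device with the quirk set and neither _RST nor
FLR is left with no generic reset at all.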

> > Power cycling the slot or just re-asserting #PERST probably. IBM has
> > been doing that on Power boxes since forever and it mostly works.
> > Mostly.
> According to ACPI spec v6.3 section 7.3.25, _RST just performs normal
> FLR in most cases but if the device supports _PRR(Power Resource for Reset)
> then reset operation causes the device to be reported as missing from the bus
> that indicates that it affects all the devices on the bus.

We're only looking for _RST on the device handle, so I think we're
limited to the device context limitations. Per the referenced section:

7.3.25 _RST (Device Reset)

This object executes a reset on the associated device or devices. If
                                                                  ^^
included in a device context, the reset must not affect any other
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ACPI-described devices; if included in a power resource for reset
^^^^^^^^^^^^^^^^^^^^^^
(_PRR, Section 7.3.26) the reset must affect all ACPI-described
devices that reference it.

When this object is described in a device context, it executes a
function level reset that only affects the device it is associated
with; neither parent nor children should be affected by the execution
of this reset. Executing this must only result in this device
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
resetting without the device appearing as if it has been removed from
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
the bus altogether, to prevent OSPM re-enumeration of devices on
^^^^^^^^^^^^^^^^^^
hot-pluggable buses (e.g. USB).

If a device reset is supported by the platform, but cannot meet the
function level and bus requirement, the device should instead
implement a _PRR (Section 7.3.26).

Devices can define both a _RST and a _PRR if supported by the
hardware.

Arguments: None

Return Value: None


It's a bit unfortunate that they use the phrase "function level reset",
but since this method is not specific to a PCI device, I think this
could just as easily be replaced with "individual device scope reset".
The implementation of that could be a PCI FLR, or any number of device
or platform specific operations. To me this reads like a system
firmware provided, device specific reset. Thanks,

Alex