Re: [PATCH 2/2] PCI: Create device link for NVIDIA GPU

From: Abhishek Sahu
Date: Tue Jun 04 2019 - 07:47:05 EST

On 6/3/2019 10:52 PM, Bjorn Helgaas wrote:
> [+cc Rafael, just FYI]
> On Mon, Jun 03, 2019 at 01:30:51PM +0530, Abhishek Sahu wrote:
>> On 6/1/2019 2:09 AM, Bjorn Helgaas wrote:
>>> On Fri, May 31, 2019 at 10:31:09AM +0530, Abhishek Sahu wrote:
>>>> NVIDIA Turing GPUs include hardware support for USB Type-C and
>>>> VirtualLink. It helps in delivering the power, display, and data
>>>> required to power VR headsets through a single USB Type-C connector.
>>>> The Turing GPU is a multi-function PCI device has the following
>>>> four functions:
>>>> - VGA display controller (Function 0)
>>>> - Audio controller (Function 1)
>>>> - USB xHCI Host controller (Function 2)
>>>> - USB Type-C USCI controller (Function 3)
>>>> The function 0 is tightly coupled with other functions in the
>>>> hardware. When function 0 goes in runtime suspended state,
>>>> then it will do power gating for most of the hardware blocks.
>>>> Some of these hardware blocks are used by other functions which
>>>> leads to functional failure. So if any of these functions (1/2/3)
>>>> are active, then function 0 should also be in active state.
>>>> 'commit 07f4f97d7b4b ("vga_switcheroo: Use device link for
>>>> HDA controller")' creates the device link from function 1 to
>>>> function 0. A similar kind of device link needs to be created
>>>> between function 0 and functions 2 and 3 for NVIDIA Turing GPU.
>>> I can't point to language that addresses this, but this sounds like a
>>> case of the GPU not conforming to the PCI spec. The general
>>> assumption is that the OS should be able to discover everything it
>>> needs to do power management directly from the architected PCI config
>>> space.
>> The GPU is following PCIe spec but following is the implementation
>> from HW side
> Unless you can find spec language that talks about D-state
> dependencies between functions, I claim this is not following the
> PCIe spec. For example, PCIe r5.0, sec 1.4, says "the PCI/PCIe
> hardware/software model includes architectural constructs necessary to
> discover, configure, and use a Function, without needing Function-
> specific knowledge." Sec 5.1 says "D states are associated with a
> particular Function" and "PM provides ... a mechanism to identify
> power management capabilities of a given Function [and] the ability to
> transition a Function into a certain power management state."

Thanks Bjorn. Here in case of GPU's these functions are not
completely independent so it is not following PCIe spec in
that aspect.

> If there *is* something about dependencies between functions in the
> spec, we should improve the generic PCI core to pay attention to that,
> and then we wouldn't need this quirk.
> If the spec doesn't provide a way to discover them, these dependencies
> are exceptions from the spec, and we have to handle them as hardware
> defects, using quirks like this. That's fine, but let's not pretend
> that this is a conforming device and that adding quirks is the
> expected process. Just call a spade a spade and say we're working
> around a defect in this particular device.

Yes. I am agree with that we need to be very careful in
adding quirks like this. I will communicate the same to
HW team so they can explore other options to handle this
in HW design side for future chips.

> I think the best path forward would be to add this quirk for the
> existing device, and then pursue a spec change to add something like
> a new PCIe capability to describe the dependencies. Then we could
> enhance the PCI core once and power management for future devices
> would "Just Work" without having to add quirks.

Yes. It will be long term process. If other HW has similar
requirement then it would be good to have this.