Re: [PATCH 09/10] PCI: tegra: Add Tegra194 PCIe support

From: Liviu Dudau
Date: Wed Apr 10 2019 - 07:35:07 EST


On Wed, Apr 10, 2019 at 03:23:39PM +0530, Vidya Sagar wrote:
> On 4/10/2019 1:44 PM, Liviu Dudau wrote:
> > On Wed, Apr 10, 2019 at 11:40:40AM +0530, Vidya Sagar wrote:
> > > On 4/9/2019 6:56 PM, Bjorn Helgaas wrote:
> > > > On Tue, Apr 09, 2019 at 05:00:53PM +0530, Vidya Sagar wrote:
> > > > > On 4/6/2019 12:28 AM, Bjorn Helgaas wrote:
> > > > > > On Fri, Apr 05, 2019 at 01:23:51AM +0530, Vidya Sagar wrote:
> > > > > > > On 4/3/2019 11:06 PM, Bjorn Helgaas wrote:
> > > > > > > > On Wed, Apr 03, 2019 at 03:13:09PM +0530, Vidya Sagar wrote:
> > > > > > > > > On 4/3/2019 12:01 AM, Bjorn Helgaas wrote:
> > > > > > > > > > On Tue, Apr 02, 2019 at 12:47:48PM +0530, Vidya Sagar wrote:
> > > > > > > > > > > On 3/30/2019 2:22 AM, Bjorn Helgaas wrote:
> > > > > > > > > > > > On Tue, Mar 26, 2019 at 08:43:26PM +0530, Vidya Sagar wrote:
> > > > > > > > > > > > > Add support for Synopsys DesignWare core IP based PCIe host controller
> > > > > > > > > > > > > present in Tegra194 SoC.
> > > > > > > > > >
> > > > > > > > > > - Why does this chip require pcie_pme_disable_msi()? The only other
> > > > > > > > > > use is a DMI quirk for "MSI Wind U-100", added by c39fae1416d5
> > > > > > > > > > ("PCI PM: Make it possible to force using INTx for PCIe PME
> > > > > > > > > > signaling").
> > > > > > > > >
> > > > > > > > > Because Tegra194 doesn't support raising PME interrupts through MSI line.
> > > >
> > > > > > There's something wrong here. Either the question of how PME is
> > > > > > signaled is generic and the spec provides a way for the OS to discover
> > > > > > that method, or it's part of the device-specific architecture that
> > > > > > each host bridge driver has to know about its device. If the former,
> > > > > > we need to make the PCI core smart enough to figure it out. If the
> > > > > > latter, we need a better interface than this ad hoc
> > > > > > pcie_pme_disable_msi() thing. But if it is truly the latter, your
> > > > > > current code is sufficient and we can refine it over time.
> > > > >
> > > > > In case of Tegra194, it is the latter case.
> > > >
> > > > This isn't a Tegra194 question; it's a question of whether this
> > > > behavior is covered by the PCIe spec.
> > > AFAIU the spec and what I heard from Nvidia hardware folks is that spec doesn't
> > > explicitly talk about this and it was a design choice made by Nvidia hardware
> > > folks to route these interrupts through legacy line instead of MSI line.
> > >
> > > >
> > > > > > What I suspect should happen eventually is the DWC driver should call
> > > > > > devm_pci_alloc_host_bridge() directly, as all the non-DWC drivers do.
> > > > > > That would require a little reorganization of the DWC data structures,
> > > > > > but it would be good to be more consistent.
> > > > > >
> > > > > > For now, I think stashing the pointer in pcie_port or dw_pcie would be
> > > > > > OK. I'm not 100% clear on the difference, but it looks like either
> > > > > > should be common across all the DWC drivers, which is what we want.
> > > > >
> > > > > Since dw_pcie is common for both root port and end point mode structures,
> > > > > I think it makes sense to keep the pointer in pcie_port as this structure
> > > > > is specific to root port mode of operation.
> > > > > I'll make a note to reorganize code to have devm_pci_alloc_host_bridge()
> > > > > used in the beginning itself to be inline with non-DWC implementations.
> > > > > But, I'll take it up later (after I'm done with upstreaming current series)
> > > >
> > > > Fair enough.
> > > >
> > > > > > > .remove() internally calls pm_runtime_put_sync() API which calls
> > > > > > > .runtime_suspend(). I made a new patch to add a host_deinit() call
> > > > > > > which make all these calls. Since host_init() is called from inside
> > > > > > > .runtime_resume() of this driver, to be in sync, I'm now calling
> > > > > > > host_deinit() from inside .runtime_suspend() API.
> > > > > >
> > > > > > I think this is wrong. pci_stop_root_bus() will detach all the
> > > > > > drivers from all the devices. We don't want to do that if we're
> > > > > > merely runtime suspending the host bridge, do we?
> > > > >
> > > > > In the current driver, the scenarios in which .runtime_suspend() is called
> > > > > are
> > > > > a) during .remove() call and
> > > >
> > > > It makes sense that you should call pci_stop_root_bus() during
> > > > .remove(), i.e., when the host controller driver is being removed,
> > > > because the PCI bus will no longer be accessible. I think you should
> > > > call it *directly* from tegra_pcie_dw_remove() because that will match
> > > > what other drivers do.
> > > >
> > > > > b) when there is no endpoint found and controller would be shutdown
> > > > > In both cases, it is required to stop the root bus and remove all devices,
> > > > > so, instead of having same call present in respective paths, I kept them
> > > > > in .runtime_suspend() itself to avoid code duplication.
> > > >
> > > > I don't understand this part. We should be able to runtime suspend
> > > > the host controller without detaching drivers for child devices.
> > > >
> > > > If you shutdown the controller completely and detach the *host
> > > > controller driver*, sure, it makes sense to detach drivers from child
> > > > devices. But that would be handled by the host controller .remove()
> > > > method, not by the runtime suspend method.
> > > I think it is time I give some background about why I chose to implement
> > > .runtime_suspend() and .runtime_resume() APIs in the first place. We wanted to
> > > powerdown PCIe controller if there is no link up (i.e. slot is open and no endpoint
> > > devices are connected). We want to achieve this without returning a failure in
> > > .probe() call. Given PCIe IP power partitioning is handled by generic power domain
> > > framework, power partition gets unpowergated before .probe() gets called and gets
> > > powergated either when a failure is returned in .probe() or when pm_runtime_put_sync()
> > > is called. So, I chose to call pm_runtime_put_sync() in no-link-up scenario and chose
> > > to implement .runtime_suspend() to handle all the cleanup work before PCIe partition
> > > getting powergated. In fact, to match this, I'm doing all the PCIe IP bring up
> > > activity in .runtime_resume() implementation which gets invoked by pm_runtime_get_sync()
> > > which in turn is called in .probe() path. In fact the very dw_pcie_host_init() itself
> > > is called from .runtime_resume() implementation. So, it is because of these reasons that
> > > I called pci_stop_root_bus() and pci_remove_root_bus() as part of .runtime_suspend()
> > > implementation as pm_runtime_put_sync() is called from both .remove() and also during
> > > no-link-up scenario. Please do let me know if there is a better way to do this.
> >
> > I think you're missing the case where .runtime_suspend() is called when
> > there are no *active* devices on the bus, i.e. everyone is dormant. It
> > doesn't mean that you need to remove them from the bus and re-probe them
> > back on .runtime_resume(). Most of the drivers for PCI devices don't
> > expect to be removed during idle, as they will configure the hardware to
> > be in a "quick wake" state (see PCIe Dx power states).
> >
> > You should probe and configure the bus during .probe() and remove and
> > detach all drivers during .remove(). For .runtime_suspend() all you need
> > to do is put the host controller in low power mode if it has one, or
> > stop all clocks that are not required for responding to devices waking
> > up from PCIe Dx state. For .runtime_resume() you then restore the
> > clocks, without re-scanning the bus.
> Since this is a host controller driver and the device as such is sitting on platform
> bus instead of PCIe bus, is it still the case that .runtime_suspend() and
> .runtime_resume() of this driver get called when devices on PCIe bus are idle?

The functions will be called when the device is determined to be idle,
i.e. when there are no PM references being held by other drivers.

Think of it the other way: even if the device is sitting on the platform
bus for configuration reasons, you don't want to turn it off when the
PCIe bus needs to be active, right?


>
> Having asked that, I start to feel what I'm doing as part of .runtime_suspend() and
> .runtime_resume() doesn't really justify these API names. Since I know where I'm
> calling pm_runtime_get/put_sync() APIs (which eventually call .runtime_resume/suspend())
> I should probably move the content of these APIs before calling pm_runtime_get/put_sync().
> Do you agree with that?

Yeah, I think you are right. Also, I believe there are pm_runtime_get()
calls in the pci framework as well, you need to audit the code, I
haven't looked at it in a while.

Best regards,
Liviu

>
> >
> > Best regards,
> > Liviu
> >
> >
> > >
> > > >
> > > > Bjorn
> > > >
> > >
> >
>

--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
Â\_(ã)_/Â