Re: [PATCH v4 6/8] PCI/link: Re-add BW notification portdrv as PCIe BW controller
From: Ilpo Järvinen
Date: Tue Jan 09 2024 - 08:28:59 EST
On Tue, 9 Jan 2024, Krishna Chaitanya Chundru wrote:
> On 1/5/2024 4:55 PM, Ilpo Järvinen wrote:
> > This mostly reverts b4c7d2076b4e ("PCI/LINK: Remove bandwidth
> > notification") and builds PCIe bandwidth controller on top of it.
> >
> > The PCIe bandwidth notifications were first added in the commit
> > e8303bb7a75c ("PCI/LINK: Report degraded links via link bandwidth
> > notification") but later had to be removed. The significant changes
> > compared with the old bandwidth notification driver include:
> >
> > 1) Don't print the notifications into kernel log, just keep the Link
> > Speed cached into the struct pci_bus updated. While somewhat
> > unfortunate, the log spam was the source of complaints that
> > eventually led to the removal of the bandwidth notifications driver
> > (see the links below for further information).
> >
> > 2) Besides the Link Bandwidth Management Interrupt, enable also Link
> > Autonomous Bandwidth Interrupt to cover the other source of
> > bandwidth changes.
> >
> > 3) Use threaded IRQ with IRQF_ONESHOT to handle Bandwidth Notification
> > Interrupts to address the problem fixed in the commit 3e82a7f9031f
> > ("PCI/LINK: Supply IRQ handler so level-triggered IRQs are acked").
> >
> > 4) Handle Link Speed updates robustly. Refresh the cached Link Speed
> > when enabling Bandwidth Notification Interrupts, and solve the race
> > between Link Speed read and LBMS/LABS update in
> > pcie_bandwidth_notification_irq_thread().
> >
> > 5) Use concurrency safe LNKCTL RMW operations.
> >
> > 6) The driver is now called PCIe bwctrl (bandwidth controller) instead
> > of just bandwidth notifications because of increased scope and
> > functionality within the driver.
> >
> > PCIe bandwidth controller introduces an in-kernel API to set PCIe Link
> > Speed. This new API is intended to be used in an upcoming commit that
> > adds a thermal cooling device to throttle PCIe bandwidth when thermal
> > thresholds are reached. No users are introduced in this commit yet.
> >
> > The PCIe bandwidth control procedure is as follows. The highest speed
> > supported by the Port and the PCIe device which is not higher than the
> > requested speed is selected and written into the Target Link Speed in
> > the Link Control 2 Register. Then the bandwidth controller retrains the
> > PCIe Link.
> >
> > Bandwidth Notifications enable the cur_bus_speed in the struct pci_bus
> > to keep track of PCIe Link Speed changes. While Bandwidth Notifications
> > should also be generated when bandwidth controller alters the PCIe Link
> > Speed, a few platforms do not deliver LBMS interrupt after Link
> > Training as expected. Thus, after changing the Link Speed, bandwidth
> > controller makes an additional read of the Link Status Register to ensure
> > cur_bus_speed is consistent with the new PCIe Link Speed.
> >
> > Link: https://lore.kernel.org/all/20190429185611.121751-1-helgaas@xxxxxxxxxx/
> > Link: https://lore.kernel.org/linux-pci/20190501142942.26972-1-keith.busch@xxxxxxxxx/
> > Link: https://lore.kernel.org/linux-pci/20200115221008.GA191037@xxxxxxxxxx/
> > Suggested-by: Lukas Wunner <lukas@xxxxxxxxx>
> > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx>
> > +/**
> > + * pcie_bwctrl_set_current_speed - Set downstream Link Speed for PCIe Port
> > + * @srv: PCIe Port
> > + * @speed_req: requested PCIe Link Speed
> > + *
> > + * Attempts to set PCIe Port Link Speed to @speed_req. @speed_req may be
> > + * adjusted downwards to the best speed supported by both the Port and PCIe
> > + * Device underneath it.
> > + *
> > + * Return:
> > + * * 0 - on success
> > + * * -EINVAL - @speed_req is not a PCIe Link Speed
> > + * * -ETIMEDOUT - changing Link Speed took too long
> > + * * -EAGAIN - Link Speed was changed but @speed_req was not achieved
> > + */
> > +int pcie_bwctrl_set_current_speed(struct pcie_device *srv, enum pci_bus_speed speed_req)
>
> We want to use this API from a PCIe client driver, but we can't, because the
> client driver is not aware of the pcie_device structure.
>
> Can you please change this API so that PCIe client drivers can also use it?
> Can you please also export it?
I'll make the v5 interface based on struct pci_dev *. It will require
looking up the internal data structure, though.
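
For reference, the speed selection step described in the quoted commit
message (pick the highest speed supported by both the Port and the device
that is not higher than the requested one) can be modelled in plain
userspace C roughly as below. This is only a sketch under assumptions: the
speed numbering, the bitmask layout and model_select_speed() are made up for
the example and are not the code in the patch.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical encoding for this sketch only: speeds are numbered 1..6
 * (2.5 GT/s .. 64 GT/s) and bit (n - 1) of a mask is set when speed n is
 * supported by that end of the Link.
 */
enum model_speed {
	SPEED_2_5GT = 1,
	SPEED_5GT,
	SPEED_8GT,
	SPEED_16GT,
	SPEED_32GT,
	SPEED_64GT,
};

#define SPEED_BIT(s)	(1u << ((s) - 1))

/*
 * Return the highest speed that both ends support and that does not exceed
 * speed_req. Returns -EINVAL if speed_req is not a valid speed in this
 * model, and 0 if the two ends share no speed at or below speed_req.
 */
static int model_select_speed(uint8_t port_speeds, uint8_t dev_speeds,
			      int speed_req)
{
	uint8_t common = port_speeds & dev_speeds;
	int s;

	if (speed_req < SPEED_2_5GT || speed_req > SPEED_64GT)
		return -EINVAL;

	/* Walk downwards from the request until a common speed is found. */
	for (s = speed_req; s >= SPEED_2_5GT; s--) {
		if (common & SPEED_BIT(s))
			return s;
	}

	return 0;
}
```

With a Port mask of 0x0f and a device mask of 0x07, a request for
SPEED_16GT is adjusted downwards to SPEED_8GT in this model, which is the
"adjusted downwards" case mentioned in the kernel-doc comment.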
--
i.