Re: [PATCH v9 6/6] PCI: qcom: Add OPP support to scale performance state of power domain

From: Manivannan Sadhasivam
Date: Mon Apr 08 2024 - 05:45:45 EST


On Mon, Apr 08, 2024 at 02:32:18PM +0530, Krishna Chaitanya Chundru wrote:
>
>
> On 4/7/2024 8:30 PM, Manivannan Sadhasivam wrote:
> > On Sun, Apr 07, 2024 at 10:07:39AM +0530, Krishna chaitanya chundru wrote:
> > > QCOM Resource Power Manager-hardened (RPMh) is a hardware block which
> > > maintains hardware state of a regulator by performing max aggregation of
> > > the requests made by all of the clients.
> > >
> > > PCIe controller can operate on different RPMh performance state of power
> > > domain based on the speed of the link. And this performance state varies
> > > from target to target, like some controllers support GEN3 in NOM (Nominal)
> > > voltage corner, while some other supports GEN3 in low SVS (static voltage
> > > scaling).
> > >
> > > The SoC can be more power efficient if we scale the performance state
> > > based on the aggregate PCIe link bandwidth.
> > >
> > > Add Operating Performance Points (OPP) support to vote for RPMh state based
> > > on the aggregate link bandwidth.
> > >
> > > OPP can handle ICC bw voting also, so move ICC bw voting through OPP
> > > framework if OPP entries are present.
> > >
> > > Different link configurations may share the same aggregate bandwidth,
> > > e.g., a 2.5 GT/s x2 link and a 5.0 GT/s x1 link have the same bandwidth
> > > and share the same OPP entry.
> > >
> >
> > This info should be part of the dts change.
> >
> OK, I will move this to the dts patch in the next series.
> > > As we are moving ICC voting as part of OPP, don't initialize ICC if OPP
> > > is supported.
> > >
> > > Before PCIe link is initialized vote for highest OPP in the OPP table,
> > > so that we are voting for maximum voltage corner for the link to come up
> > > in maximum supported speed.
> > >
> > > Signed-off-by: Krishna chaitanya chundru <quic_krichai@xxxxxxxxxxx>
> > > ---
> > > drivers/pci/controller/dwc/pcie-qcom.c | 72 +++++++++++++++++++++++++++-------
> > > 1 file changed, 58 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
> > > index b4893214b2d3..4ad5ef3bf8fc 100644
> > > --- a/drivers/pci/controller/dwc/pcie-qcom.c
> > > +++ b/drivers/pci/controller/dwc/pcie-qcom.c
> > > @@ -22,6 +22,7 @@
> > > #include <linux/of.h>
> > > #include <linux/of_gpio.h>
> > > #include <linux/pci.h>
> > > +#include <linux/pm_opp.h>
> > > #include <linux/pm_runtime.h>
> > > #include <linux/platform_device.h>
> > > #include <linux/phy/pcie.h>
> > > @@ -1442,15 +1443,13 @@ static int qcom_pcie_icc_init(struct qcom_pcie *pcie)
> > > return 0;
> > > }
> > > -static void qcom_pcie_icc_update(struct qcom_pcie *pcie)
> > > +static void qcom_pcie_icc_opp_update(struct qcom_pcie *pcie)
> > > {
> > > struct dw_pcie *pci = pcie->pci;
> > > - u32 offset, status;
> > > + u32 offset, status, freq;
> > > + struct dev_pm_opp *opp;
> > > int speed, width;
> > > - int ret;
> > > -
> > > - if (!pcie->icc_mem)
> > > - return;
> > > + int ret, mbps;
> > > offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
> > > status = readw(pci->dbi_base + offset + PCI_EXP_LNKSTA);
> > > @@ -1462,10 +1461,26 @@ static void qcom_pcie_icc_update(struct qcom_pcie *pcie)
> > > speed = FIELD_GET(PCI_EXP_LNKSTA_CLS, status);
> > > width = FIELD_GET(PCI_EXP_LNKSTA_NLW, status);
> > > - ret = icc_set_bw(pcie->icc_mem, 0, width * QCOM_PCIE_LINK_SPEED_TO_BW(speed));
> > > - if (ret) {
> > > - dev_err(pci->dev, "failed to set interconnect bandwidth for PCIe-MEM: %d\n",
> > > - ret);
> > > + if (pcie->icc_mem) {
> > > + ret = icc_set_bw(pcie->icc_mem, 0, width * QCOM_PCIE_LINK_SPEED_TO_BW(speed));
> > > + if (ret) {
> > > + dev_err(pci->dev, "failed to set interconnect bandwidth for PCIe-MEM: %d\n",
> >
> > s/failed/Failed
> >
> > > + ret);
> > > + }
> > > + } else {
> > > + mbps = pcie_link_speed_to_mbps(pcie_link_speed[speed]);
> > > + if (mbps < 0)
> > > + return;
> > > +
> > > + freq = mbps * 1000;
> > > + opp = dev_pm_opp_find_freq_exact(pci->dev, freq * width, true);
> >
> > As per the API documentation, dev_pm_opp_put() should be called for both the
> > success and failure cases.
> >
> ACK.
> > > + if (!IS_ERR(opp)) {
> >
> > So what is the action if OPP is not found for the freq?
> >
> There is already a vote for the maximum freq in probe, so if the lookup
> fails here we can continue.
> If you feel otherwise, let me know and I can make the changes as suggested.

You should just log the error and continue.
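
To make the intended flow concrete, here is a rough userspace sketch, not
kernel code. All names in it (opp_keys, speed_to_mbps, find_exact, update_opp)
are made up for illustration, and the table contents are an assumption. It
mirrors the patch's arithmetic (per-lane Mbps * 1000 * width) and shows the
"log and continue" behaviour when no entry matches:

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for an OPP table: one lookup key per aggregate rate. */
static const unsigned long opp_keys[] = { 2500000, 5000000, 8000000, 16000000 };

/* Per-lane rate in Mbps for a link speed field value (1 = 2.5 GT/s ... 4 = 16 GT/s). */
static int speed_to_mbps(int speed)
{
	static const int mbps[] = { -1, 2500, 5000, 8000, 16000 };

	if (speed < 1 || speed > 4)
		return -1;
	return mbps[speed];
}

/* Exact-match lookup; returns the table index or -1 if no entry matches. */
static int find_exact(unsigned long key)
{
	size_t i;

	for (i = 0; i < sizeof(opp_keys) / sizeof(opp_keys[0]); i++)
		if (opp_keys[i] == key)
			return (int)i;
	return -1;
}

/*
 * Returns 0 if a matching entry was found, -1 otherwise. A miss is only
 * logged; the caller carries on with whatever vote is already in place.
 */
static int update_opp(int speed, int width)
{
	int mbps = speed_to_mbps(speed);
	unsigned long key;

	if (mbps < 0)
		return -1;

	key = (unsigned long)mbps * 1000 * width;
	if (find_exact(key) < 0) {
		fprintf(stderr, "Failed to find OPP for freq %lu\n", key);
		return -1;
	}
	return 0;
}
```

With this table, a 2.5 GT/s x2 link and a 5.0 GT/s x1 link both compute the
key 5000000 and hit the same entry, while a combination with no entry is
reported and otherwise ignored.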

> > > + ret = dev_pm_opp_set_opp(pci->dev, opp);
> > > + if (ret)
> > > + dev_err(pci->dev, "Failed to set opp: freq %ld ret %d\n",
> >
> > 'Failed to set OPP for freq (%ld): %d'
> >
> > > + dev_pm_opp_get_freq(opp), ret);
> > > + dev_pm_opp_put(opp);
> > > + }
> > > }
> > > }
> > > @@ -1509,8 +1524,10 @@ static void qcom_pcie_init_debugfs(struct qcom_pcie *pcie)
> > > static int qcom_pcie_probe(struct platform_device *pdev)
> > > {
> > > const struct qcom_pcie_cfg *pcie_cfg;
> > > + unsigned long max_freq = INT_MAX;
> > > struct device *dev = &pdev->dev;
> > > struct qcom_pcie *pcie;
> > > + struct dev_pm_opp *opp;
> > > struct dw_pcie_rp *pp;
> > > struct resource *res;
> > > struct dw_pcie *pci;
> > > @@ -1577,9 +1594,33 @@ static int qcom_pcie_probe(struct platform_device *pdev)
> > > goto err_pm_runtime_put;
> > > }
> > > - ret = qcom_pcie_icc_init(pcie);
> > > - if (ret)
> > > + /* OPP table is optional */
> > > + ret = devm_pm_opp_of_add_table(dev);
> > > + if (ret && ret != -ENODEV) {
> > > + dev_err_probe(dev, ret, "Failed to add OPP table\n");
> > > goto err_pm_runtime_put;
> > > + }
> > > +
> > > + /*
> > > + * Use highest OPP here if the OPP table is present. At the end of
> >
> > I believe I asked you to add the information justifying why the highest OPP
> > should be used.
> >
> I added the info in the commit message; I will also add it as a comment in
> the next patch.
>
> > > + * the probe(), OPP will be updated using qcom_pcie_icc_opp_update().
> > > + */
> > > + if (!ret) {
> > > + opp = dev_pm_opp_find_freq_floor(dev, &max_freq);
> >
> > Same comment as dev_pm_opp_find_freq_exact().
> >
> > > + if (!IS_ERR(opp)) {
> > > + ret = dev_pm_opp_set_opp(dev, opp);
> > > + if (ret)
> > > + dev_err_probe(pci->dev, ret,
> > > + "Failed to set OPP: freq %ld\n",
> >
> > Same comment as above.
> >
> > > + dev_pm_opp_get_freq(opp));
> > > + dev_pm_opp_put(opp);
> >
> > So you want to continue even in the case of failure?
> >
> I will make changes to fall back to driver voting for ICC bw if it fails here.

That's not needed. If the OPP table is present, then failure to set OPP should
be treated as a hard failure.
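
Roughly, the decision I have in mind looks like the sketch below. This is a
plain C model, not the driver code: probe_opp_policy() and its two parameters
are hypothetical and simply stand for the return value of adding the OPP table
and the return value of setting the highest OPP.

```c
#include <errno.h>

/*
 * Hypothetical model of the probe-time policy:
 *   table_ret - result of adding the OPP table (-ENODEV means "no table")
 *   set_ret   - result of setting the highest OPP (only relevant with a table)
 * Returns 0 to continue probing, or a negative errno to fail the probe.
 */
static int probe_opp_policy(int table_ret, int set_ret)
{
	/* The table is optional, so its absence is not an error. */
	if (table_ret == -ENODEV)
		return 0;

	/* Any other failure while adding the table fails the probe. */
	if (table_ret)
		return table_ret;

	/* Table is present: failing to set the OPP is a hard failure. */
	if (set_ret)
		return set_ret;

	return 0;
}
```

So a missing table keeps probe going, but once a table exists there is no
fallback path and the error is propagated.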

- Mani

--
மணிவண்ணன் சதாசிவம்