Re: [PATCH v12 06/13] PCI: dwc: Add dw_pcie_parent_bus_offset() checking and debug
From: Frank Li
Date: Tue Mar 25 2025 - 15:07:18 EST
On Tue, Mar 25, 2025 at 11:39:15PM +0530, Manivannan Sadhasivam wrote:
> On Mon, Mar 24, 2025 at 03:04:37PM -0500, Bjorn Helgaas wrote:
> > On Mon, Mar 24, 2025 at 11:00:24PM +0530, Manivannan Sadhasivam wrote:
> > > On Sat, Mar 15, 2025 at 03:15:41PM -0500, Bjorn Helgaas wrote:
> > > > From: Frank Li <Frank.Li@xxxxxxx>
> > > >
> > > > dw_pcie_parent_bus_offset() looks up the parent bus address of a PCI
> > > > controller 'reg' property in devicetree. If implemented, .cpu_addr_fixup()
> > > > is a hard-coded way to get the parent bus address corresponding to a CPU
> > > > physical address.
> > > >
> > > > Add debug code to compare the address from .cpu_addr_fixup() with the
> > > > address from devicetree. If they match, warn that .cpu_addr_fixup() is
> > > > redundant and should be removed; if they differ, warn that something is
> > > > wrong with the devicetree.
> > > >
> > > > If .cpu_addr_fixup() is not implemented, the parent bus address should be
> > > > identical to the CPU physical address because we previously ignored the
> > > > parent bus address from devicetree. If the devicetree has a different
> > > > parent bus address, warn about it being broken.
> > > >
> > > > [bhelgaas: split debug to separate patch for easier future revert, commit
> > > > log]
> > > > Link: https://lore.kernel.org/r/20250313-pci_fixup_addr-v11-5-01d2313502ab@xxxxxxx
> > > > Signed-off-by: Frank Li <Frank.Li@xxxxxxx>
> > > > Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> > > > ---
> > > > drivers/pci/controller/dwc/pcie-designware.c | 26 +++++++++++++++++++-
> > > > drivers/pci/controller/dwc/pcie-designware.h | 13 ++++++++++
> > > > 2 files changed, 38 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> > > > index 0a35e36da703..985264c88b92 100644
> > > > --- a/drivers/pci/controller/dwc/pcie-designware.c
> > > > +++ b/drivers/pci/controller/dwc/pcie-designware.c
> > > > @@ -1114,7 +1114,8 @@ resource_size_t dw_pcie_parent_bus_offset(struct dw_pcie *pci,
> > > > struct device *dev = pci->dev;
> > > > struct device_node *np = dev->of_node;
> > > > int index;
> > > > - u64 reg_addr;
> > > > + u64 reg_addr, fixup_addr;
> > > > + u64 (*fixup)(struct dw_pcie *pcie, u64 cpu_addr);
> > > >
> > > > /* Look up reg_name address on parent bus */
> > > > index = of_property_match_string(np, "reg-names", reg_name);
> > > > @@ -1126,5 +1127,28 @@ resource_size_t dw_pcie_parent_bus_offset(struct dw_pcie *pci,
> > > >
> > > > of_property_read_reg(np, index, ®_addr, NULL);
> > > >
> > > > + fixup = pci->ops->cpu_addr_fixup;
> > > > + if (fixup) {
> > > > + fixup_addr = fixup(pci, cpu_phy_addr);
> > > > + if (reg_addr == fixup_addr) {
> > > > + dev_warn(dev, "%#010llx %s reg[%d] == %#010llx; %ps is redundant\n",
> > > > + cpu_phy_addr, reg_name, index,
> > > > + fixup_addr, fixup);
> > > > + } else {
> > > > + dev_warn(dev, "%#010llx %s reg[%d] != %#010llx fixed up addr; devicetree is broken\n",
> > > > + cpu_phy_addr, reg_name,
> > > > + index, fixup_addr);
> > > > + reg_addr = fixup_addr;
> > > > + }
> > > > + } else if (!pci->use_parent_dt_ranges) {
> > >
> > > Is this check still valid? 'use_parent_dt_ranges' is only used here for
> > > validation. Moreover, if the fixup is not available, we should be able to
> > > safely return 'cpu_phy_addr - reg_addr' unconditionally.
> >
> > Yes, that's true IF the devicetree has the correct 'ranges'
> > translation. This is to avoid breaking platforms with broken
> > devicetrees.
> >
>
> You mean the driver without cpu_addr_fixup() and devicetree with broken 'ranges'
> property? So the existing platforms... Not a bad idea though.
I worry about some platform without cpu_addr_fixup() use fake translation
by ranges property.
like
bus{
...
ranges = <fake_addr, real_addr, size>
pcie {
regs = <fake_addr + offset>
reg-names = "config".
...
}
}
So add above check, we will remove it if none report it for sometimes.
Frank
>
> > > > + if (reg_addr != cpu_phy_addr) {
> > > > + dev_warn(dev, "devicetree has incorrect translation; please check parent \"ranges\" property. CPU physical addr %#010llx, parent bus addr %#010llx\n",
> > > > + cpu_phy_addr, reg_addr);
> > > > + return 0;
> > > > + }
> > > > + }
> > > > +
> > > > + dev_info(dev, "%s parent bus offset is %#010llx\n",
> > > > + reg_name, cpu_phy_addr - reg_addr);
> > >
> > > This info is useless on platforms having no translation between CPU and PCI
> > > controller. The offset will always be 0.
> >
> > You're right. This was probably an overzealous message for any
> > possible issues.
> >
> > What would you think of the below as a replacement? It should emit at
> > most one message, and none for platforms where devicetree describes no
> > translation and there never was a .cpu_addr_fixup().
> >
> > It's still pretty aggressive logging, but I'm just concerned about
> > being able to quickly debug and fix any regressions. Ideally we can
> > revert the whole thing eventually.
> >
> >
> > diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> > index 27b464a405a4..4b442d1aa55b 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware.c
> > +++ b/drivers/pci/controller/dwc/pcie-designware.c
> > @@ -1114,7 +1114,8 @@ resource_size_t dw_pcie_parent_bus_offset(struct dw_pcie *pci,
> > struct device *dev = pci->dev;
> > struct device_node *np = dev->of_node;
> > int index;
> > - u64 reg_addr;
> > + u64 reg_addr, fixup_addr;
> > + u64 (*fixup)(struct dw_pcie *pcie, u64 cpu_addr);
> >
> > /* Look up reg_name address on parent bus */
> > index = of_property_match_string(np, "reg-names", reg_name);
> > @@ -1126,5 +1127,42 @@ resource_size_t dw_pcie_parent_bus_offset(struct dw_pcie *pci,
> >
> > of_property_read_reg(np, index, ®_addr, NULL);
> >
> > + fixup = pci->ops ? pci->ops->cpu_addr_fixup : NULL;
> > + if (fixup) {
> > + fixup_addr = fixup(pci, cpu_phys_addr);
> > + if (reg_addr == fixup_addr) {
> > + dev_info(dev, "%s reg[%d] %#010llx == %#010llx == fixup(cpu %#010llx); %ps is redundant with this devicetree\n",
> > + reg_name, index, reg_addr, fixup_addr,
> > + (unsigned long long) cpu_phys_addr, fixup);
> > + } else {
> > + dev_warn(dev, "%s reg[%d] %#010llx != %#010llx == fixup(cpu %#010llx); devicetree is broken\n",
> > + reg_name, index, reg_addr, fixup_addr,
> > + (unsigned long long) cpu_phys_addr);
> > + reg_addr = fixup_addr;
> > + }
> > +
> > + return cpu_phys_addr - reg_addr;
> > + }
> > +
> > + if (pci->use_parent_dt_ranges) {
> > +
> > + /*
> > + * This platform once had a fixup, presumably because it
> > + * translates between CPU and PCI controller addresses.
> > + * Log a note if devicetree didn't describe a translation.
> > + */
> > + if (reg_addr == cpu_phys_addr)
> > + dev_info(dev, "%s reg[%d] %#010llx == cpu %#010llx\n; no fixup was ever needed for this devicetree\n",
> > + reg_name, index, reg_addr,
> > + (unsigned long long) cpu_phys_addr);
>
> So this check is to detect the usage of old DTs with new kernel without
> cpu_addr_fixup()? If so:
>
> (1) The log is not accurate
> (2) The driver would be broken, so the log should be an error. This condition
> should not be met (if we do not remove the fixup for some time). But I think
> this check should be moved ahead of cpu_addr_fixup() so that the correct DTs
> would be honored first and the fixup would be ignored for them.
>
> - Mani
>
> --
> மணிவண்ணன் சதாசிவம்