Re: [PATCH net-next v8 4/6] Add PCI driver support for BCM8958x
From: Jitendra Vegiraju
Date: Fri Mar 27 2026 - 14:16:37 EST
Hi Russell,
On Thu, Mar 26, 2026 at 9:56 AM Russell King (Oracle)
<linux@xxxxxxxxxxxxxxx> wrote:
>
> On Fri, Mar 20, 2026 at 02:19:19PM -0700, Jitendra Vegiraju wrote:
> > +static const struct property_entry fixed_link_properties[] = {
> > + PROPERTY_ENTRY_U32("speed", 10000),
> > + PROPERTY_ENTRY_BOOL("full-duplex"),
> > + PROPERTY_ENTRY_BOOL("pause"),
> > + { }
> > +};
> > +
> > +static const struct software_node parent_swnode = {
> > + .name = "phy-device",
> > +};
> > +
> > +static const struct software_node fixed_link_swnode = {
> > + .name = "fixed-link", /* MUST be named "fixed-link" */
> > + .parent = &parent_swnode,
> > + .properties = fixed_link_properties,
> > +};
> > +
> > +static const struct software_node *brcm_swnodes[] = {
> > + &parent_swnode,
> > + &fixed_link_swnode,
> > + NULL
> > +};
>
> Looking at this structure, I'm not sure it's correct. You seem to have:
>
> pci_device
> - "phy-device" swnode attached here (which describes the PCI device,
> which isn't any kind of PHY)
> - "fixed-link" attached as a child
>
> The "fixed-link" is a property for the local network device which
> signifies that there isn't a PHY attached or there's an inaccessible
> PHY that only operates with one set of settings.
>
> Maybe rename "phy-device" to "ethernet"?
>
Sure, that make sense. I will rename it as "ethernet"
> > +
> > +struct brcm_priv_data {
> > + void __iomem *mbox_regs; /* MBOX Registers*/
> > + void __iomem *misc_regs; /* MISC Registers*/
> > + void __iomem *xgmac_regs; /* XGMAC Registers*/
> > +};
> > +
> > +struct dwxgmac_brcm_pci_info {
> > + int (*setup)(struct pci_dev *pdev, struct plat_stmmacenet_data *plat);
> > +};
> > +
> > +static void misc_iowrite(struct brcm_priv_data *brcm_priv,
> > + u32 reg, u32 val)
> > +{
> > + iowrite32(val, brcm_priv->misc_regs + reg);
> > +}
> > +
> > +static void dwxgmac_brcm_common_default_data(struct plat_stmmacenet_data *plat)
> > +{
> > + int i;
> > +
> > + plat->force_sf_dma_mode = true;
> > + plat->mac_port_sel_speed = SPEED_10000;
> > + plat->clk_ptp_rate = 125000000;
> > + plat->clk_ref_rate = 250000000;
> > + plat->tx_coe = true;
> > + plat->rx_coe = STMMAC_RX_COE_TYPE1;
> > + plat->rss_en = 1;
> > + plat->max_speed = SPEED_10000;
> > +
> > + /* Set default value for multicast hash bins */
> > + plat->multicast_filter_bins = HASH_TABLE_SIZE;
>
> Already the default setup by stmmac_plat_dat_alloc().
Ack.
>
> > +
> > + /* Set default value for unicast filter entries */
> > + plat->unicast_filter_entries = 1;
>
> Already the default setup by stmmac_plat_dat_alloc().
>
Ack
> > +
> > + /* Set the maxmtu to device's default */
> > + plat->maxmtu = BRCM_MAX_MTU;
> > +
> > + /* Set default number of RX and TX queues to use */
> > + plat->tx_queues_to_use = BRCM_TX_Q_COUNT;
> > + plat->rx_queues_to_use = BRCM_RX_Q_COUNT;
> > +
> > + plat->tx_sched_algorithm = MTL_TX_ALGORITHM_SP;
> > + for (i = 0; i < plat->tx_queues_to_use; i++) {
> > + plat->tx_queues_cfg[i].use_prio = false;
>
> Already false.
>
Ack
> > + plat->tx_queues_cfg[i].prio = 0;
>
> Already zero.
>
Ack
> > + plat->tx_queues_cfg[i].mode_to_use = MTL_QUEUE_AVB;
>
> Since MTL_QUEUE_AVB is zero, this is already the case.
>
> > + }
>
> All three points taken together mean that this loop is not required
> as all these members are being explicitly set to values of zero,
> which they already hold.
>
Ack
> > +
> > + plat->rx_sched_algorithm = MTL_RX_ALGORITHM_SP;
> > + for (i = 0; i < plat->rx_queues_to_use; i++) {
> > + plat->rx_queues_cfg[i].use_prio = false;
>
> Already false.
>
Ack
> > + plat->rx_queues_cfg[i].mode_to_use = MTL_QUEUE_AVB;
>
> Since MTL_QUEUE_AVB is zero, this is already the case.
>
Ack
> > + plat->rx_queues_cfg[i].pkt_route = 0x0;
>
> Already zero.
>
Ack
> > + plat->rx_queues_cfg[i].chan = i;
>
> stmmac_plat_dat_alloc() already initialises plat->rx_queues_cfg[].chan.
>
> > + }
>
> Taking all these points together, it means that this loop also isn't
> required, since you're not changing anything that hasn't already been
> setup.
>
True, That will eliminate some redundant lines.
> > +}
> > +
> > +static int dwxgmac_brcm_default_data(struct pci_dev *pdev,
> > + struct plat_stmmacenet_data *plat)
> > +{
> > + /* Set common default data first */
> > + dwxgmac_brcm_common_default_data(plat);
> > + plat->core_type = DWMAC_CORE_25GMAC;
> > + plat->bus_id = 0;
>
> The underlying devm_kzalloc() which allocates "plat" will clear the
> struct to zeros, so this assignment to bus_id shouldn't be necessary.
>
> > + plat->phy_addr = 0;
>
> You said there's no MDIO bus, so I don't think you need to initialise
> plat->phy_addr. stmmac_plat_dat_alloc() will set this to -1.
>
Ack
> > + plat->phy_interface = PHY_INTERFACE_MODE_XGMII;
> > +
> > + plat->dma_cfg->pbl = DEFAULT_DMA_PBL;
> > + plat->dma_cfg->pblx8 = true;
> > + plat->dma_cfg->aal = false;
> > + plat->dma_cfg->eame = true;
> > +
> > + plat->axi->axi_wr_osr_lmt = 31;
> > + plat->axi->axi_rd_osr_lmt = 31;
> > + plat->axi->axi_fb = false;
>
> devm_kzalloc() which is used to allocate plat->axi in the probe function
> will zero out this structure, so axi_fb will already be false.
>
Ack
> > + plat->axi->axi_blen_regval = DMA_AXI_BLEN64;
> > + return 0;
> > +}
> > +
> > +static struct dwxgmac_brcm_pci_info dwxgmac_brcm_pci_info = {
> > + .setup = dwxgmac_brcm_default_data,
> > +};
>
> It looks to me like this is a copy of stmmac_pci.c / dwmac-intel.c etc.
> Do you know for certain that you're going to need to do different
> setups depending on the PCI device?
>
> What's the reasoning for the split between
> dwxgmac_brcm_common_default_data() and dwxgmac_brcm_default_data() ?
>
We intend to eventually add support for another device with a
different setup function.
If the preference is to keep it simple until the additional
indirection is needed, I will
remove the setup function abstraction.
> > +
> > +static void brcm_config_misc_regs(struct pci_dev *pdev,
> > + struct brcm_priv_data *brcm_priv)
> > +{
> > + pci_write_config_dword(pdev, XGMAC_PCIE_CFG_MSIX_ADDR_MATCH_LOW,
> > + XGMAC_PCIE_CFG_MSIX_ADDR_MATCH_LO_VALUE);
> > + pci_write_config_dword(pdev, XGMAC_PCIE_CFG_MSIX_ADDR_MATCH_HIGH,
> > + XGMAC_PCIE_CFG_MSIX_ADDR_MATCH_HI_VALUE);
> > +
> > + misc_iowrite(brcm_priv, XGMAC_PCIE_MISC_MSIX_ADDR_MATCH_LO_OFFSET,
> > + XGMAC_PCIE_MISC_MSIX_ADDR_MATCH_LO_VALUE);
> > + misc_iowrite(brcm_priv, XGMAC_PCIE_MISC_MSIX_ADDR_MATCH_HI_OFFSET,
> > + XGMAC_PCIE_MISC_MSIX_ADDR_MATCH_HI_VALUE);
> > +
> > + /* Enable Switch Link */
> > + misc_iowrite(brcm_priv, XGMAC_PCIE_MISC_MII_CTRL_OFFSET,
> > + XGMAC_PCIE_MISC_MII_CTRL_PAUSE_RX |
> > + XGMAC_PCIE_MISC_MII_CTRL_PAUSE_TX |
> > + XGMAC_PCIE_MISC_MII_CTRL_LINK_UP);
> > +}
> > +
> > +static int brcm_config_multi_msi(struct pci_dev *pdev,
> > + struct plat_stmmacenet_data *plat,
> > + struct stmmac_resources *res)
> > +{
> > + int ret;
> > + int i;
> > +
> > + ret = pci_alloc_irq_vectors(pdev, BRCM_XGMAC_MSI_VECTOR_MAX,
> > + BRCM_XGMAC_MSI_VECTOR_MAX,
> > + PCI_IRQ_MSI | PCI_IRQ_MSIX);
> > + if (ret < 0) {
> > + dev_err(&pdev->dev, "%s: multi MSI enablement failed\n",
> > + __func__);
> > + return ret;
> > + }
> > +
> > + /* For RX MSI */
> > + for (i = 0; i < plat->rx_queues_to_use; i++)
> > + res->rx_irq[i] =
> > + pci_irq_vector(pdev,
> > + BRCM_XGMAC_MSI_RX_VECTOR_START + i * 2);
> > +
> > + /* For TX MSI */
> > + for (i = 0; i < plat->tx_queues_to_use; i++)
> > + res->tx_irq[i] =
> > + pci_irq_vector(pdev,
> > + BRCM_XGMAC_MSI_TX_VECTOR_START + i * 2);
> > +
> > + res->irq = pci_irq_vector(pdev, BRCM_XGMAC_MSI_MAC_VECTOR);
> > +
> > + plat->flags |= STMMAC_FLAG_MULTI_MSI_EN;
> > + plat->flags |= STMMAC_FLAG_TSO_EN;
> > + plat->flags |= STMMAC_FLAG_SPH_DISABLE;
> > + return 0;
> > +}
> > +
> > +static int brcm_pci_resume(struct device *dev, void *bsp_priv)
> > +{
> > + struct pci_dev *pdev = to_pci_dev(dev);
> > +
> > + brcm_config_misc_regs(pdev, bsp_priv);
>
> Is it worth declaring struct pdev for one place that it's used?
>
> brcm_config_misc_regs(to_pci_dev(dev), bsp_priv);
>
> should work just as well.
>
Ack. Will change it.
> > +
> > + return stmmac_pci_plat_resume(dev, bsp_priv);
> > +}
> > +
> > +static int dwxgmac_brcm_pci_probe(struct pci_dev *pdev,
> > + const struct pci_device_id *id)
> > +{
> > + struct dwxgmac_brcm_pci_info *info =
> > + (struct dwxgmac_brcm_pci_info *)id->driver_data;
> > + struct plat_stmmacenet_data *plat;
> > + struct brcm_priv_data *brcm_priv;
> > + struct stmmac_resources res;
> > + struct device *dev;
> > + int rx_offset;
> > + int tx_offset;
> > + int vector;
> > + int ret;
> > +
> > + dev = &pdev->dev;
>
> As you go to the effort of declaring a struct device pointer, and
> assign it, do you think it would be a good idea to either use it for
> all &pdev->dev instances below, or just get rid of the two instances
> that you actually use "dev" ?
>
> I count six instances of "&pdev->dev" below vs two making use of "dev"
> directly.
>
Ack. I will change it.
> > +
> > + brcm_priv = devm_kzalloc(&pdev->dev, sizeof(*brcm_priv), GFP_KERNEL);
> > + if (!brcm_priv)
> > + return -ENOMEM;
> > +
> > + plat = stmmac_plat_dat_alloc(dev);
> > + if (!plat)
> > + return -ENOMEM;
> > +
> > + plat->axi = devm_kzalloc(&pdev->dev, sizeof(*plat->axi), GFP_KERNEL);
> > + if (!plat->axi)
> > + return -ENOMEM;
> > +
> > + /* This device is directly attached to the switch chip internal to the
> > + * SoC using XGMII interface. Since no MDIO is present, register
> > + * fixed-link software_node to create phylink.
> > + */
> > + software_node_register_node_group(brcm_swnodes);
> > + device_set_node(dev, software_node_fwnode(&parent_swnode));
> > +
> > + /* Disable D3COLD as our device does not support it */
> > + pci_d3cold_disable(pdev);
> > +
> > + /* Enable PCI device */
> > + ret = pcim_enable_device(pdev);
> > + if (ret) {
> > + dev_err(&pdev->dev, "%s: ERROR: failed to enable device\n",
> > + __func__);
> > + return ret;
>
> What about cleaning up the swnodes ?
>
As you commented on patch5, I will squash the two patches.
> > + }
> > +
> > + pci_set_master(pdev);
> > +
> > + memset(&res, 0, sizeof(res));
> > + res.addr = pcim_iomap_region(pdev, 0, pci_name(pdev));
> > + if (IS_ERR(res.addr))
> > + return dev_err_probe(&pdev->dev, PTR_ERR(res.addr),
> > + "failed to map IO region\n");
>
> Convention is to have a blank line here.
>
Ack.
> > + /* MISC Regs */
> > + brcm_priv->misc_regs = res.addr + BRCM_XGMAC_IOMEM_MISC_REG_OFFSET;
> > + /* MBOX Regs */
> > + brcm_priv->mbox_regs = res.addr + BRCM_XGMAC_IOMEM_MBOX_REG_OFFSET;
> > + /* XGMAC config Regs */
> > + res.addr += BRCM_XGMAC_IOMEM_CFG_REG_OFFSET;
> > + brcm_priv->xgmac_regs = res.addr;
> > +
> > + plat->suspend = stmmac_pci_plat_suspend;
> > + plat->resume = brcm_pci_resume;
> > + plat->bsp_priv = brcm_priv;
> > +
> > + ret = info->setup(pdev, plat);
> > + if (ret)
> > + return ret;
>
> What about cleaning up the swnodes ?
>
Patch 5 addressed this.
> > +
> > + pci_write_config_dword(pdev, XGMAC_PCIE_CFG_MSIX_ADDR_MATCH_LOW,
> > + XGMAC_PCIE_CFG_MSIX_ADDR_MATCH_LO_VALUE);
> > + pci_write_config_dword(pdev, XGMAC_PCIE_CFG_MSIX_ADDR_MATCH_HIGH,
> > + XGMAC_PCIE_CFG_MSIX_ADDR_MATCH_HI_VALUE);
> > +
> > + misc_iowrite(brcm_priv, XGMAC_PCIE_MISC_MSIX_ADDR_MATCH_LO_OFFSET,
> > + XGMAC_PCIE_MISC_MSIX_ADDR_MATCH_LO_VALUE);
> > + misc_iowrite(brcm_priv, XGMAC_PCIE_MISC_MSIX_ADDR_MATCH_HI_OFFSET,
> > + XGMAC_PCIE_MISC_MSIX_ADDR_MATCH_HI_VALUE);
> > +
> > + /* SBD Interrupt */
> > + misc_iowrite(brcm_priv, XGMAC_PCIE_MISC_MSIX_VECTOR_MAP_SBD_ALL_OFFSET,
> > + XGMAC_PCIE_MISC_MSIX_VECTOR_MAP_SBD_ALL_VALUE);
> > + /* EP_DOORBELL Interrupt */
> > + misc_iowrite(brcm_priv,
> > + XGMAC_PCIE_MISC_MSIX_VECTOR_MAP_EP2HOST_DBELL_OFFSET,
> > + XGMAC_PCIE_MISC_MSIX_VECTOR_MAP_EP2HOST_DBELL_VALUE);
> > + /* EP_H0 Interrupt */
> > + misc_iowrite(brcm_priv,
> > + XGMAC_PCIE_MISC_MSIX_VECTOR_MAP_EP2HOST0_OFFSET,
> > + XGMAC_PCIE_MISC_MSIX_VECTOR_MAP_EP2HOST0_VALUE);
> > + /* EP_H1 Interrupt */
> > + misc_iowrite(brcm_priv,
> > + XGMAC_PCIE_MISC_MSIX_VECTOR_MAP_EP2HOST1_OFFSET,
> > + XGMAC_PCIE_MISC_MSIX_VECTOR_MAP_EP2HOST1_VALUE);
> > +
> > + rx_offset = XGMAC_PCIE_MISC_MSIX_VECTOR_MAP_RX0_PF0_OFFSET;
> > + tx_offset = XGMAC_PCIE_MISC_MSIX_VECTOR_MAP_TX0_PF0_OFFSET;
> > + vector = BRCM_XGMAC_MSI_RX_VECTOR_START;
> > + for (int i = 0; i < BRCM_MAX_DMA_CHANNEL_PAIRS; i++) {
> > + /* RX Interrupt */
> > + misc_iowrite(brcm_priv, rx_offset, vector++);
> > + /* TX Interrupt */
> > + misc_iowrite(brcm_priv, tx_offset, vector++);
> > + rx_offset += 4;
> > + tx_offset += 4;
> > + }
>
> It looks like this device can program the MSI vector numbers. Does
> it make sense to interleave them, or would it be simpler to have
> all the receive vectors and then all the transmit vectors?
>
> This also hard-codes the fact that BRCM_XGMAC_MSI_TX_VECTOR_START
> is one more than BRCM_XGMAC_MSI_RX_VECTOR_START, which isn't nice
> given that you use these macros when claiming the MSI vectors.
>
Thanks for your input. I will remove the vector interleaving.
> > +
> > + /* Enable Switch Link */
> > + misc_iowrite(brcm_priv, XGMAC_PCIE_MISC_MII_CTRL_OFFSET,
> > + XGMAC_PCIE_MISC_MII_CTRL_PAUSE_RX |
> > + XGMAC_PCIE_MISC_MII_CTRL_PAUSE_TX |
> > + XGMAC_PCIE_MISC_MII_CTRL_LINK_UP);
> > + /* Enable MSI-X */
> > + misc_iowrite(brcm_priv, XGMAC_PCIE_MISC_PCIESS_CTRL_OFFSET,
> > + XGMAC_PCIE_MISC_PCIESS_CTRL_EN_MSI_MSIX);
> > +
> > + ret = brcm_config_multi_msi(pdev, plat, &res);
> > + if (ret) {
> > + dev_err(&pdev->dev,
> > + "%s: ERROR: failed to enable IRQ\n", __func__);
> > + goto err_disable_msi;
> > + }
> > +
> > + ret = stmmac_dvr_probe(&pdev->dev, plat, &res);
> > + if (ret)
> > + goto err_disable_msi;
> > +
> > + return ret;
> > +
> > +err_disable_msi:
> > + pci_free_irq_vectors(pdev);
>
> This is still buggy. What about cleaning up the swnodes?
Patch 5 addressed this. I will squash patches 4 and 5 in V9.
>
> > +
> > + return ret;
> > +}
> > +
> > +static void dwxgmac_brcm_pci_remove(struct pci_dev *pdev)
> > +{
> > + stmmac_dvr_remove(&pdev->dev);
> > + pci_free_irq_vectors(pdev);
> > + device_set_node(&pdev->dev, NULL);
> > + software_node_unregister_node_group(brcm_swnodes);
>
> As the remove function does way more cleanup than the probe function,
> this is a sign that the probe function is buggy. This is exactly why
> I suggested using ->init and ->exit in the previous review. I seem
> to have been ignored on that though... and the problem I already
> pointed out remains.
As mentioned above, I will squash patches 4 and 5.
>
> Thanks.
>
> --
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Attachment:
smime.p7s
Description: S/MIME Cryptographic Signature