Re: [PATCH] net: bcmgenet: Fix attaching to PYH failed on RPi 4B

From: Jian-Hong Pan
Date: Tue Jun 22 2021 - 02:29:57 EST


Florian Fainelli <f.fainelli@xxxxxxxxx> 於 2021年6月22日 週二 上午5:47寫道:
>
> On 6/21/21 1:15 PM, Stefan Wahren wrote:
> > Am 21.06.21 um 18:56 schrieb Peter Robinson:
> >> On Mon, Jun 21, 2021 at 5:39 PM Florian Fainelli <f.fainelli@xxxxxxxxx> wrote:
> >>> On 6/21/21 6:09 AM, Andrew Lunn wrote:
> >>>> On Mon, Jun 21, 2021 at 06:33:11PM +0800, Jian-Hong Pan wrote:
> >>>>> The Broadcom UniMAC MDIO bus comes too late. So, GENET cannot find the
> >>>>> ethernet PHY on UniMAC MDIO bus. This leads GENET fail to attach the
> >>>>> PHY.
> >>>>>
> >>>>> bcmgenet fd580000.ethernet: GENET 5.0 EPHY: 0x0000
> >>>>> ...
> >>>>> could not attach to PHY
> >>>>> bcmgenet fd580000.ethernet eth0: failed to connect to PHY
> >>>>> uart-pl011 fe201000.serial: no DMA platform data
> >>>>> libphy: bcmgenet MII bus: probed
> >>>>> ...
> >>>>> unimac-mdio unimac-mdio.-19: Broadcom UniMAC MDIO bus
> >>>>>
> >>>>> This patch makes GENET try to connect the PHY up to 3 times. Also, waits
> >>>>> a while between each time for mdio-bcm-unimac module's loading and
> >>>>> probing.
> >>>> Don't loop. Return -EPROBE_DEFER. The driver core will then probed the
> >>>> driver again later, by which time, the MDIO bus driver should of
> >>>> probed.
> >>> This is unlikely to work because GENET register the mdio-bcm-unimac
> >>> platform device so we will likely run into a chicken and egg problem,
> >>> though surprisingly I have not seen this on STB platforms where GENET is
> >>> used, I will try building everything as a module like you do. Can you
> >>> see if the following helps:
> >> For reference we have mdio_bcm_unimac/genet both built as modules in
> >> Fedora and I've not seen this issue reported using vanilla upstream
> >> kernels if that's a useful reference point.
> >
> > I was also unable to reproduce this issue, but it seems to be a known
> > issue [1], [2].
> >
> > Jian-Hong opened an issue in my Github repo [3], but before the issue
> > was narrowed down, he decided to send this workaround.
>
> The comment about changing the phy-mode property is not quite making
> sense to me, except if that means that in one case the Broadcom PHY
> driver is used and in the other case the Generic PHY driver is used.
>
> What is not clear to me from the debugging that has been done so far is
> whether the mdio-bcm-unimac MDIO controller was not loaded at the time
> of_phy_connect() was trying to identify the PHY device.

MODULE_SOFTDEP("pre: mdio-bcm-unimac") mentioned in the comment [1]
solves this issue.

Tracing the code by following the debug message in comment #2 [2], I
learned the path bcmgenet_mii_probe()'s of_phy_connect() ->
of_phy_find_device() -> of_mdio_find_device() ->
bus_find_device_by_of_node(). And, bus_find_device_by_of_node()
cannot find the device on the mdio bus.

So, I traced bcm2711-rpi-4-b's device tree to find out which one is
the mdio device and why it has not been prepared ready on the mdio bus
for genet.
Then, I found out it is mdio-bcm-unimac module as mentioned in comment
#4 [3]. Also, noticed "unimac-mdio unimac-mdio.-19: Broadcom UniMAC
MDIO bus" comes after "bcmgenet fd580000.ethernet eth0: failed to
connect to PHY" in the log.

With these findings, I try to re-modprobe genet module again. The
ethernet on RPi 4B works correctly! Also, noticed mdio-bcm-unimac
module is loaded before I re-modprobe genet module.
Therefore, I try to make mdio-bcm-unimac built in kernel image,
instead of a module. Then, genet always can find the mdio device on
the bus and the ethernet works as well.

Consequently, the idea, loading mdio-bcm-unimac module earlier than
genet module comes in my head! However, I don't know the key word
"MODULE_SOFTDEP" until Florian's guide. That is why I have a loop to
connect the PHY in the original patch. But, I understand
MODULE_SOFTDEP is a better solution now!

I think this is like the module loading order situation mentioned in
commit 11287b693d03 ("r8169: load Realtek PHY driver module before
r8169") [4].

[1] https://bugzilla.kernel.org/show_bug.cgi?id=213485#c6
[2] https://bugzilla.kernel.org/show_bug.cgi?id=213485#c2
[3] https://bugzilla.kernel.org/show_bug.cgi?id=213485#c4
[4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=11287b693d03830010356339e4ceddf47dee34fa

Jian-Hong Pan