Re: Move MT7530 phy muxing from DSA to PHY driver
From: Vladimir Oltean
Date: Mon Mar 27 2023 - 14:38:31 EST
On Sun, Mar 26, 2023 at 07:52:12PM +0300, Arınç ÜNAL wrote:
> I'm currently working on the mt7530 driver and I think I found a way
> that takes the least effort, won't break the ABI, and most importantly,
> will work.
This sounds promising....
> As we acknowledged, the registers are in the switch address space. This
> must also mean that the switch must be properly probed before doing
> anything with the registers.
>
> I'm not sure how exactly making a tiny driver would work in this case.
I'm not sure how it would work, either. It sounds like the driver for
the mdio-bus address @1f should have been a parent driver (MFD or not)
with 2 (platform device) children, one for the switch and another for
the HWTRAP registers and whatever else might be needed for the PHY
multiplexing. The parent (mdio_device) driver deals with the chip-wide
reset, resources, and manages the registers, giving them regmaps.
The driver with the mux probably just exports a symbol representing a
function that gets called by the "mediatek,eth-mac" driver and/or the
switch driver.
BTW, I have something vaguely similar to this in a stalled WIP branch
for sja1105, but things like this get really complicated really quickly
if the DSA driver's DT bindings weren't designed from day one to not
cover the entire switch's register map.
> I figured we can just run the phy muxing code before the DSA driver
> exits because there are no (CPU) ports defined on the devicetree. Right
> after probing is done on mt7530_probe, before dsa_register_switch() is run.
Aren't there timing issues, though? When is the earliest moment that the
"mediatek,eth-mac" driver needs the HWTRAP muxing to be changed?
The operation of changing that from the "mediatek,mt7530" driver is
completely asynchronous to the probing of "mediatek,eth-mac".
What's the worst that will happen with incorrect (not yet updated) GMII
signal muxing? "Just" some lost packets?
>
> For proof of concept, I've moved some necessary switch probing code from
> mt7530_setup() to mt7530_probe(). After the switch is properly reset,
> phy4 is muxed, before dsa_register_switch() is run.
This is fragile because someone eager for some optimizations could move
the code back the way it was, and say: "the switch initialization costs
X ms and is done X times, because dsa_register_switch() -> ... ->
of_find_net_device_by_node() returns -EPROBE_DEFER the first X-1 times.
If we move the switch initialization to ds->ops->setup(), it will run
only once, after the handle to the DSA master has been obtained, and
this gives us a boost in kernel startup time."
It's even more fragile because currently (neither before nor after your change),
mt7530_remove() does not do the mirror opposite of mt7530_probe(), and somebody
eager from the future will notice this, and add an error handling path for
dsa_register_switch(), which calls the opposite of regulator_enable(),
regulator_disable(), saying "hey, there's no reason to let the regulators
on if the switch failed to probe, it consumes power for nothing!".
It's an open question whether that regulator is needed for anything after
the HWMUX registers has been changed, or if it can indeed be turned off.
Not knowing this, it's hard to say if the change is okay or not.
It seems that there's a high probability it will work for a while,
by coincidence.
>
> [ 0.650721] mt7530 mdio-bus:1f: MT7530 adapts as multi-chip module
> [ 0.660285] mt7530 mdio-bus:1f: muxing phy4 to gmac5
> [ 0.665284] mt7530 mdio-bus:1f: no ports child node found
> [ 0.670688] mt7530: probe of mdio-bus:1f failed with error -22
> [ 0.679118] mtk_soc_eth 1e100000.ethernet: generated random MAC address b6:9c:4d:eb:1f:8e
> [ 0.688922] mtk_soc_eth 1e100000.ethernet eth0: mediatek frame engine at 0xbe100000, irq 15
>
> ---
>
> # ifup eth0
> [ 30.674595] mtk_soc_eth 1e100000.ethernet eth0: configuring for fixed/rgmii link mode
> [ 30.683451] mtk_soc_eth 1e100000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
> # ping 192.168.2.2
> PING 192.168.2.2 (192.168.2.2): 56 data bytes
> 64 bytes from 192.168.2.2: seq=0 ttl=64 time=0.688 ms
> 64 bytes from 192.168.2.2: seq=1 ttl=64 time=0.375 ms
> 64 bytes from 192.168.2.2: seq=2 ttl=64 time=0.357 ms
> 64 bytes from 192.168.2.2: seq=3 ttl=64 time=0.323 ms
>
> ---
>
> # ip a
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> inet 127.0.0.1/8 scope host lo
> valid_lft forever preferred_lft forever
> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
> link/ether b6:9c:4d:eb:1f:8e brd ff:ff:ff:ff:ff:ff
> inet 192.168.2.1/24 scope global eth0
> valid_lft forever preferred_lft forever
>
> There is a lot to do, such as fixing the method to read from the
> devicetree as it relies on the mac node the CPU port is connected to but
> when this is finalised, we should be able to use it like this:
>
> mac@1 {
> compatible = "mediatek,eth-mac";
> reg = <1>;
> phy-mode = "rgmii";
> phy-handle = <ðphy0>;
> };
>
> mdio-bus {
> #address-cells = <1>;
> #size-cells = <0>;
>
> ethphy0: ethernet-phy@0 {
> reg = <0>;
> };
>
> switch@1f {
> compatible = "mediatek,mt7530";
> reg = <0x1f>;
> reset-gpios = <&pio 33 0>;
> core-supply = <&mt6323_vpa_reg>;
> io-supply = <&mt6323_vemc3v3_reg>;
> };
> };
And this is fragile because the "mediatek,eth-mac" driver only works
because of the side effects of a driver that began to probe, and failed.
Someone, seeing that "mediatek,mt7530" fails to probe, and knowing that
the switch ports are not needed/used on that board, could put a
status = "disabled"; property under the switch@1f node.