Re: [PATCH v2 5/5] arm64: dts: ti: Add TQ-Systems TQMa62xx SoM and MBa62xx carrier board Device Trees
From: Matthias Schiffer
Date: Wed Mar 19 2025 - 10:47:45 EST
On Wed, 2025-03-19 at 14:37 +0100, Andrew Lunn wrote:
> > Hi Andrew,
> >
> > I've just thought about this issue again. As mentioned, a number of MAC
> > drivers(*) implement what is described here:
> >
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/net/ethernet-controller.yaml?h=v6.13#n77
> >
> >
> > That is, the delay is either added by the PHY or the MAC; having a delay on the
> > PCB is simply not supported.
>
> Yes it is, you are reading it wrong. First off, as said a number of
> times in the last couple of months, this description is from the
> perspective of the PHY. It is not the board perspective. So:
>
> # RX and TX delays are added by the MAC when required
> - rgmii
>
> From the perspective of the PHY, "when required" the MAC adds the
> delay. It might not be required, because the PCB adds the delay.
I fail to see how the documentation could be interpreted that way. As you
explained, "rgmii" actually means that the delay is added by the PCB, so there
simply is no "when required" here - it can't be required if it's already in the
PCB. I stand by my opinion that this doc is plain wrong and not just worded
poorly, and needs to be fixed to express the intended meaning.
(I believe we agree on the point that this should be improved, I just don't
think it is appropriate to make statements like "few developers actually
understand RGMII delays" when the inaccuracy of the binding docs - supposedly an
authoritative resource on the meaning of DT properties - is the cause of the
misunderstanding)
>
> # RGMII with internal RX and TX delays provided by the PHY,
> # the MAC should not add the RX or TX delays in this case
> - rgmii-id
>
> The MAC should not add delays because the PHY does. But this implies
> the PCB cannot also be adding delays because you would end up with two
> sets of delays.
>
> > Fixing MAC drivers to interpret the values without
> > "id" to mean that there is a delay on the PCB will break existing Device Trees,
> > so that's no good.
>
> You need to look at it broken driver by broken driver. I _think_ the
> Aspeed case can be fixed. Others we need to look at the details. Maybe
> in some cases we do need to add a warning to the device tree binding
> the driver is FUBAR and has special, broken meaning for 'rgmii'.
I don't know about Aspeed (maybe you're mixing something up, or I'm not aware of
something?) - in the case of the TI driver I'm dealing with, this is indeed
possible to fix up because the MAC always adds a delay, it is not configurable.
>
> > As a first step, I'd update the docs to describe the intended behavior, but
> > mention that some drivers implement it wrong.
>
> Feel free to submit patches. However, please think about it. DT is OS
> agnostic. DT describes the hardware, not configuration. Other OSes
> might decide on a different policy about which of the MAC/PHY pair
> implements the delay, since that is configuration, not a hardware
> description.
>
> > The question is how to deal with
> > the wrong behavior going forward. I see the following options, but none spark
> > joy for me...
> >
> > - Keep current driver behavior forever where fixing it would break existing
> > Device Trees
> > - Deprecate all existing "rgmii*" values because of their inconsistent
> > implementation, come up with new ones. Seems like a huge pain to add support
> > for in all MAC drivers and other code that deals with PHY modes...
>
> Won't help anyway. If developers are getting it wrong now, why should
> they suddenly start getting it right when all you are changing is the
> name.
>
> The real issue is that you can combine two bugs to get a working
> system. It seems like few developers actually understand RGMII
> delays. They don't spend time to understand it, they just try random
> things until it works. As a result, they get systems where the MAC is
> unknowingly adding delays, and 'rgmii' works, so job done. They don't
> know or care it is wrong.
>
> > - Introduce an additional DTS flag next to phy-mode to express that the
> > phy-mode is supposed to be interpreted correctly
>
> If you are going to do add any sort of property, it should be a bool
> indicating the opposite, 'phy-mode-is-fubar'. Something which
> developers will think about before copying into their DT.
The whole point of my suggestions is to allow fixing the drivers while staying
compatible with old, broken Device Trees. If we could add a property 'phy-mode-
is-fubar' to old Device Trees, we could just fix them instead - but we can only
fix in-tree DTS, and it will only have an effect on systems that update their
DTBs with their kernel, so I believe this simply isn't an option because of
backwards compat guarantees for Device Trees.
>
> > Do you have any suggestions?
>
> Add a checkpatch.pl check. Any patch which adds a phy-mode !=
> 'rgmii-id' needs to have a comment on the previous line containing the
> word PCB. That should at least help stop more broken DT being added.
Ah, I like that idea - I'll see if I can find time to look into it.
Best,
Matthias
>
> Andrew
--
TQ-Systems GmbH | Mühlstraße 2, Gut Delling | 82229 Seefeld, Germany
Amtsgericht München, HRB 105018
Geschäftsführer: Detlef Schneider, Rüdiger Stahl, Stefan Schneider
https://www.tq-group.com/