RE: [PATCH net-next v4 05/12] net: ethernet: oa_tc6: implement error interrupts unmasking
From: Piergiorgio Beruto
Date: Fri May 24 2024 - 16:47:10 EST
Hi all,
Just my two cents here...
Collision detection is a fundamental building block of the CSMA/CD mechanism.
The PHY detects physical collisions and reports them to the MAC via the COL pin.
The MAC is supposed to perform the normal back-off operation: stop the current transmission and re-transmit at a later time following the exponentially increasing random algorithm (See IEEE 802.3 Clause 4).
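To make the back-off concrete, here is a minimal C sketch of the truncated binary exponential back-off described in IEEE 802.3 Clause 4. The function names are hypothetical, and rand() only stands in for whatever random source a real MAC uses. In a MACPHY this logic lives in the hardware, not in the Linux driver.

```c
#include <stdint.h>
#include <stdlib.h>

#define BACKOFF_LIMIT 10  /* exponent stops growing after 10 collisions */
#define ATTEMPT_LIMIT 16  /* frame is dropped after 16 failed attempts */

/* Exclusive upper bound on the number of slot times to wait after the
 * given number of collisions on the current frame (hypothetical helper). */
static uint32_t backoff_window(unsigned int collisions)
{
	unsigned int k = collisions < BACKOFF_LIMIT ? collisions : BACKOFF_LIMIT;

	return 1u << k;
}

/* Number of slot times to wait before the next retransmission, or -1 if
 * the attempt limit is reached and the frame must be discarded.
 * rand() is an assumption made for illustration only. */
static int backoff_slots(unsigned int collisions)
{
	if (collisions >= ATTEMPT_LIMIT)
		return -1;

	return (int)((uint32_t)rand() % backoff_window(collisions));
}
```

After the third collision, for example, backoff_slots(3) picks a delay of 0 to 7 slot times. None of this is visible to the host.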
** Therefore, collision detect shall NOT be brought up to the user. In a MACPHY system it is supposed to be handled entirely by the PHY. **
With the introduction of PLCA, the PHY may also report "logical collisions" to the MAC. These are not real collisions, as they don't happen on the line. They are part of the normal PLCA back-pressure mechanism that allows the PHY to send a frame only during a specific transmit opportunity. So once more, this kind of collision shall not be reported to the user; it is just normal behavior.
However, physical collisions (i.e., collisions that really happen on the line) are NOT supposed to happen when PLCA is enabled and configured correctly.
For this reason we have a standard IEEE register (PCS Diagnostic 2) which still captures physical collisions (not logical ones). This register is supposed to be used as diagnostic information to let the user know that something is misconfigured.
Now, some PHYs can be configured to provide IRQs when this kind of collision happens, indicating a configuration problem.
Additionally, many PHYs allow the user to completely switch off the PHY physical collision detection when PLCA is enabled. This is also recommended by the OPEN Alliance specifications, although no standard register was defined to achieve that (maybe we should add it to the standard...).
I'm not sure I followed all this discussion, I just hope this clarification might help in finding a good solution.
In short:
- Collisions are NOT to be reported to Linux. The MAC/PHY shall handle those internally.
- Physical collisions are still counted in a diagnostic register (it could be good to add it to the ethtool MAC statistics).
- Disabling collision detection is allowed only when PLCA is enabled, but it is not a standard feature, although automotive specs recommend it.
Thanks,
Piergiorgio
-----Original Message-----
From: Andrew Lunn <andrew@xxxxxxx>
Sent: 24 May, 2024 20:32
To: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@xxxxxxxxxxx>
Cc: Parthiban.Veerasooran@xxxxxxxxxxxxx; Piergiorgio Beruto <Pier.Beruto@xxxxxxxxxx>; davem@xxxxxxxxxxxxx; edumazet@xxxxxxxxxx; kuba@xxxxxxxxxx; pabeni@xxxxxxxxxx; horms@xxxxxxxxxx; saeedm@xxxxxxxxxx; anthony.l.nguyen@xxxxxxxxx; netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; corbet@xxxxxxx; linux-doc@xxxxxxxxxxxxxxx; robh+dt@xxxxxxxxxx; krzysztof.kozlowski+dt@linaroorg; conor+dt@xxxxxxxxxx; devicetree@xxxxxxxxxxxxxxx; Horatiu.Vultur@xxxxxxxxxxxxx; ruanjinjie@xxxxxxxxxx; Steen.Hegelund@xxxxxxxxxxxxx; vladimir.oltean@xxxxxxx; UNGLinuxDriver@xxxxxxxxxxxxx; Thorsten.Kummermehr@xxxxxxxxxxxxx; Selvamani Rajagopal <Selvamani.Rajagopal@xxxxxxxxxx>; Nicolas.Ferre@xxxxxxxxxxxxx; benjamin.bigler@xxxxxxxxxxxxxxxxxxxxx
Subject: Re: [PATCH net-next v4 05/12] net: ethernet: oa_tc6: implement error interrupts unmasking
> After a considerable amount of headscratching it seems that disabling
> collision detection on the macphy is the only way of getting it stable.
> When PLCA is enabled it's expected that CD causes problems, when
> running in CSMA/CD mode it was unexpected (for me at least).
Now we are back to, why is your system different? What is triggering a collision for you, but not Parthiban?
There is nothing in the standard about reporting a collision. So this is a Microchip extension? So the framework is not doing anything when it happens, which would explain why it becomes a storm.... Until we do have a mechanism to handle vendor-specific interrupts, the framework should disable them all, to avoid this storm.
Does the datasheet document what to do on a collision? How are you supposed to clear the condition?
Andrew