Re: [RFC, net-next] net: qos: introduce a frer action to implement 802.1CB

From: Vladimir Oltean
Date: Fri May 06 2022 - 08:23:50 EST


Hi Ferenc,

(I adjusted the CC list)

On Fri, May 06, 2022 at 11:55:56AM +0000, Ferenc Fejes wrote:
> On 2021. 09. 28. 13:44, Xiaoliang Yang wrote:
> > This patch introduces a frer action to implement frame replication and
> > elimination for reliability, which is defined in IEEE P802.1CB.
>
> Hi Xiaoliang!
>
> thanks for your efforts to introduce a frer action to implement frame
> replication and elimination for reliability, which is defined in IEEE
> P802.1CB-2017. I would like to relay a small comment from our team
> regarding FRER itself, not the code in particular.
>
> Support of RTAG format is very straightforward.
>
> Since 2017, several maintenance items have been opened against IEEE
> P802.1CB-2017 to fix errors in the standard. The results of those
> discussions will be published soon, e.g. in IEEE P802.1CBdb
> (https://1.ieee802.org/tsn/802-1cbdb/).
>
> One of the maintenance items impacts the vector recovery algorithm itself.
>
> Details on the problem and the solution are here:
>
> - https://www.802-1.org/items/370
>
> - https://www.ieee802.org/1/files/public/docs2020/maint-varga-257-FRER-recovery-window-0320-v01.pdf
>
> It is a small but important fix. There is an incorrect reference to the
> size of the recovery window where a received packet is checked for being
> out of range. Without this fix the vector recovery algorithm does
> not work properly in some scenarios.
>
> Please consider updating your patch to reflect the IEEE maintenance
> efforts to correct the .1CB-2017 related issues.
>
> > There are two modes for frer action: generate and push the tag, recover
> > and pop the tag. frer tag has three types: RTAG, HSR, and PRP. This
> > patch only supports RTAG now.
> >
> > User can push the tag on egress port of the talker device, recover and
> > pop the tag on ingress port of the listener device. When it's a relay
> > system, push the tag on ingress port, or set individual recover on
> > ingress port. Set the sequence recover on egress port.
> >
> > Use action "mirred" to do split function, and use "vlan-modify" to do
> > active stream identification function on relay system.
> >
> All of our research on the topic is based on an in-house userspace FRER
> implementation, but we are looking forward to testing your work in the
> future.
>
> Thanks,
>
> Ferenc
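
For context, here is a rough sketch of the kind of out-of-range check the
comment above is about. It is a simplified model loosely following the shape
of the 802.1CB vector recovery algorithm, not the text of the standard and not
code from the patch. The class and attribute names are hypothetical, the reset
timer is left out, and the exact window comparison (the very thing the
maintenance item corrects) is an assumption here and should be checked against
the documents linked above.

```python
SEQ_SPACE = 1 << 16  # the R-TAG sequence number is 16 bits wide


class VectorRecovery:
    """Simplified sequence recovery with a history bit vector (hypothetical names)."""

    def __init__(self, history_len=8):
        self.history_len = history_len  # recovery window size (illustrative value)
        self.recov_seq = None           # highest sequence number accepted so far
        self.history = 0                # bit i set => (recov_seq - i) already seen

    def accept(self, seq):
        """Return True to pass the frame, False to discard it."""
        if self.recov_seq is None:
            # First frame after start: take any sequence number.
            self.recov_seq = seq
            self.history = 1
            return True

        # Signed distance from the newest accepted number, modulo the space.
        delta = (seq - self.recov_seq) % SEQ_SPACE
        if delta >= SEQ_SPACE // 2:
            delta -= SEQ_SPACE

        # Out-of-range check against the recovery window.
        # ASSUMPTION: this boundary is illustrative only.
        if abs(delta) >= self.history_len:
            return False

        if delta <= 0:
            # Inside the window, at or behind the newest number.
            bit = 1 << -delta
            if self.history & bit:
                return False  # duplicate, eliminate
            self.history |= bit
            return True

        # Ahead of the newest number: slide the window forward.
        mask = (1 << self.history_len) - 1
        self.history = ((self.history << delta) | 1) & mask
        self.recov_seq = seq
        return True
```

With a window of 4, a frame with sequence number 11 arriving after 10 and 12
is still passed, a second copy of it is dropped, and a frame far outside the
window (say 100) is discarded as out of range.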

Glad to see someone familiar with 802.1CB. I have a few questions and
concerns if you don't mind.

I think we are seeing a bit of a stall on the topic of FRER modeling in
the Linux networking stack, in no small part due to the fact that we are
working with pre-standard hardware.

The limitation with Xiaoliang's proposal here (to model FRER stream
replication and recovery as a tc action) is that I don't think it works
well for traffic termination - it only properly covers the use case of a
switch. More precisely, there isn't a single convergent termination
point for either locally originated traffic or locally received
traffic (i.e. you, as a user, don't know on which of the several
available interfaces to open a socket).

In our hardware, this limitation isn't really visible because of the way
in which the Ethernet switch is connected inside the NXP LS1028A.
It is something like this:

+---------------------------------------+
|                                       |
|           +------+ +------+           |
|           | eno2 | | eno3 |           |
|           +------+ +------+           |
|              |        |               |
|           +------+ +------+           |
|           | swp4 | | swp5 |           |
|           +------+ +------+           |
|  +------+ +------+ +------+ +------+  |
|  | swp0 | | swp1 | | swp2 | | swp3 |  |
+--+------+-+------+-+------+-+------+--+

In the above picture, the switch ports swp0-swp3 have eno3 as a DSA
master (connected to the internal swp5, a CPU port). The other internal
port, swp4, is configured as a DSA user port, so it has a net device.
Analogously, while eno3 is a DSA master and receives DSA-tagged traffic
(so it is useless for direct IP termination), eno2 receives DSA-untagged
traffic and is therefore an IP termination endpoint into a switched
network.

What we do in this case is put tc-frer rules for stream replication and
recovery on swp4 itself, and we use eno2 as the convergence point for
locally terminated streams.

However, naturally, a hardware design that does not look like this can't
terminate traffic like this.

My idea was that it might be better if FRER was its own virtual network
interface (like a bridge), with multiple slave interfaces. The FRER net
device could keep its own database of streams and actions (completely
outside of tc) which would be managed similarly to "bridge fdb add ...".
This way, the frer0 netdevice would be the local termination endpoint,
logically speaking.
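
To make the idea a bit more concrete, here is a toy userspace model of such a
device: a set of slave interfaces plus a per-device stream table, with the
device itself as the single point where local traffic enters. Everything in it
is hypothetical (the names FrerDev, StreamKey, stream_add and xmit, and the
choice of destination MAC plus VLAN ID as the stream key); it is only meant to
illustrate the shape of the model, not a proposed kernel or netlink API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StreamKey:
    """Stream identification (assumed here to be destination MAC + VLAN ID)."""
    dmac: str
    vid: int


@dataclass
class StreamEntry:
    replicate_to: tuple  # slaves that get a sequence-tagged copy on transmit
    next_seq: int = 0    # sequence generation state for this stream


@dataclass
class FrerDev:
    """Toy model of a frer0-like device owning its own stream database."""
    name: str
    slaves: list
    streams: dict = field(default_factory=dict)

    def stream_add(self, key, replicate_to):
        # Loosely analogous to "bridge fdb add ...": a per-device table entry.
        unknown = set(replicate_to) - set(self.slaves)
        if unknown:
            raise ValueError(f"not slaves of {self.name}: {sorted(unknown)}")
        self.streams[key] = StreamEntry(tuple(replicate_to))

    def xmit(self, key):
        """Return the (slave, sequence number) copies for a local frame."""
        entry = self.streams.get(key)
        if entry is None:
            # Frames outside the stream table are not handled in this sketch.
            return []
        seq = entry.next_seq
        entry.next_seq = (seq + 1) % 65536  # 16-bit sequence number space
        return [(slave, seq) for slave in entry.replicate_to]
```

An application would open its socket on the FrerDev itself, and a frame sent
on a known stream would fan out to every listed slave with the same sequence
number.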

What I don't know for sure is if a FRER netdevice is supposed to forward
frames which aren't in its list of streams (and if so, by which rules).
Because if a FRER netdevice is supposed to behave like a regular bridge
for non-streams, the implication is that the FRER logic should then be
integrated into the Linux bridge.

Also, this new FRER software model complicates the offloading on NXP
LS1028A, but let's leave that aside, since it shouldn't really be the
decisive factor in what the software model should look like.

Do you have any comments on this topic?