Re: [PATCH v2 14/14] net: ethernet: mtk_eth_soc: support creating mac address based offload entries

From: Felix Fietkau
Date: Tue Apr 12 2022 - 04:41:32 EST



On 11.04.22 15:00, Andrew Lunn wrote:
On Thu, Apr 07, 2022 at 08:21:43PM +0200, Felix Fietkau wrote:

On 07.04.22 20:10, Andrew Lunn wrote:
> On Tue, Apr 05, 2022 at 09:57:55PM +0200, Felix Fietkau wrote:
> > This will be used to implement a limited form of bridge offloading.
> > Since the hardware does not support flow table entries with just source
> > and destination MAC address, the driver has to emulate it.
> >
> > The hardware automatically creates entries for incoming flows, even
> > when they are bridged instead of routed, and reports when packets for these
> > flows have reached the minimum PPS rate for offloading.
> >
> > After this happens, we look up the L2 flow offload entry based on the MAC
> > header and fill in the output routing information in the flow table.
> > The dynamically created per-flow entries are automatically removed when
> > either the hardware flowtable entry expires, is replaced, or if the offload
> > rule they belong to is removed
> >
> > +
> > +	if (found)
> > +		goto out;
> > +
> > +	eh = eth_hdr(skb);
> > +	ether_addr_copy(key.dest_mac, eh->h_dest);
> > +	ether_addr_copy(key.src_mac, eh->h_source);
> > +	tag = skb->data - 2;
> > +	key.vlan = 0;
> > +	switch (skb->protocol) {
> > +#if IS_ENABLED(CONFIG_NET_DSA)
> > +	case htons(ETH_P_XDSA):
> > +		if (!netdev_uses_dsa(skb->dev) ||
> > +		    skb->dev->dsa_ptr->tag_ops->proto != DSA_TAG_PROTO_MTK)
> > +			goto out;
> > +
> > +		tag += 4;
> > +		if (get_unaligned_be16(tag) != ETH_P_8021Q)
> > +			break;
> > +
> > +		fallthrough;
> > +#endif
> > +	case htons(ETH_P_8021Q):
> > +		key.vlan = get_unaligned_be16(tag + 2) & VLAN_VID_MASK;
> > +		break;
> > +	default:
> > +		break;
> > +	}
>
> I'm trying to understand the architecture here.
>
> We have an Ethernet interface and a Wireless interface. The slow path
> is that frames ingress from one of these interfaces, Linux decides
> what to do with them, either L2 or L3, and they then egress probably
> out the other interface.
>
> The hardware will look at the frames and try to spot flows? It will
> then report any it finds. You can then add an offload, telling it for
> a flow it needs to perform L2 or L3 processing, and egress out a
> specific port? Linux then no longer sees the frame, the hardware
> handles it, until the flow times out?
Yes, the hw handles it until either the flow times out, or the corresponding
offload entry is removed.
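
To make the key construction in the quoted hunk a bit more concrete, here is a minimal user-space sketch, not the driver code. It builds a lookup key from the destination MAC, source MAC and VLAN ID of a raw Ethernet frame. The names `struct l2_key`, `read_be16` and `l2_key_from_frame` are hypothetical and only used for illustration, and the sketch assumes a plain 802.1Q tag directly after the source MAC; it does not model the MTK DSA tag case.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN	6
#define ETH_HLEN	14
#define ETH_P_8021Q	0x8100
#define VLAN_VID_MASK	0x0fff

/* Hypothetical key layout: both MAC addresses plus the VLAN ID (0 if untagged). */
struct l2_key {
	uint8_t dest_mac[ETH_ALEN];
	uint8_t src_mac[ETH_ALEN];
	uint16_t vlan;
};

/* Read a big-endian 16-bit value from a possibly unaligned pointer. */
static uint16_t read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Fill @key from a raw Ethernet frame.
 * Returns 0 on success, -1 if the frame is too short.
 */
static int l2_key_from_frame(const uint8_t *frame, size_t len,
			     struct l2_key *key)
{
	if (len < ETH_HLEN)
		return -1;

	memset(key, 0, sizeof(*key));
	memcpy(key->dest_mac, frame, ETH_ALEN);
	memcpy(key->src_mac, frame + ETH_ALEN, ETH_ALEN);

	/* EtherType sits at offset 12; an 802.1Q tag carries the TCI right after it. */
	if (read_be16(frame + 2 * ETH_ALEN) == ETH_P_8021Q) {
		if (len < ETH_HLEN + 4)
			return -1;
		key->vlan = read_be16(frame + ETH_HLEN) & VLAN_VID_MASK;
	}

	return 0;
}
```

Untagged frames end up with `vlan == 0`, which mirrors the `key.vlan = 0` default in the hunk.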

For OpenWrt I also wrote a daemon that uses tc classifier BPF to accelerate
the software bridge and create hardware offload entries as well via hardware
TC flower rules: https://github.com/nbd168/bridger
It works in combination with these changes.

What about the bridge? In Linux, it is the software bridge which
controls all this at L2, and it should be offloading the flows, via
switchdev. The egress port you derive here is from the software bridge
FDB?
My code uses netlink to fetch and monitor the bridge configuration, including fdb, port state, vlans, etc. and it uses that for the offload path - no extra configuration needed.

> So I'm wondering what is going on here. So is this a frame which has
> ingressed, either from the WiFi, or another switch port, gone to the
> software bridge, bridged to a DSA slave interface, the DSA tagger has
> added a tag and now it is in the master interface? Can you accelerate
> such frames? What is adding the DSA tag on the fast path? And in the
> opposite direction, frames which egress the switch which have a DSA
> tag and are heading to the WiFi, what is removing the tag? Does the
> accelerator also understand the tag and know what to do with it?
WiFi -> Ethernet is not supported by MT7622, but will be added for newer
SoCs like MT7986. The PPE supports both parsing and inserting MT7530
compatible DSA tags. For Ethernet->WiFi flows, the PPE will also add
required metadata that is parsed by the MT7915 WiFi Firmware in order to
figure out what vif/station the packets were meant for.

O.K. What about IGMP and multicast? Does the accelerator match on IGMP
and forward it to the CPU, rather than follow the flow rules? Can you
set multiple egress destinations for multicast so that it can go both
to the switch and the host, when the host has a local interest in the
traffic?
IGMP/multicast isn't handled yet. I still need to do some research on what can be offloaded and how. The offload only handles unicast; everything else goes through the CPU.
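
As a rough illustration of the unicast-only restriction, the sketch below shows one way a frame could be classified by its destination MAC. This is an assumption for illustration, not what the driver does; `dest_is_multicast` and `frame_is_offload_candidate` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if the destination MAC is a group address. */
static bool dest_is_multicast(const uint8_t *dest_mac)
{
	/* The group bit is the least significant bit of the first octet. */
	return (dest_mac[0] & 0x01) != 0;
}

/*
 * Hypothetical policy check: only frames with a complete Ethernet header
 * and a unicast destination are considered for offloading.
 */
static bool frame_is_offload_candidate(const uint8_t *frame, size_t len)
{
	return len >= 14 && !dest_is_multicast(frame);
}
```

Broadcast (ff:ff:ff:ff:ff:ff) has the group bit set as well, so it is also left to the CPU path in this sketch.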

- Felix