Re: What is the best way to provide FDB related metrics to user space?

From: Nikolay Aleksandrov
Date: Mon Mar 27 2023 - 06:52:14 EST


On 24/03/2023 16:43, Vladimir Oltean wrote:
> Hi Oleksij,
>
> On Fri, Mar 24, 2023 at 03:06:22PM +0100, Oleksij Rempel wrote:
>> Hello all,
>>
>> I am currently working on implementing an interface to provide
>> FDB-related metrics to user space, such as the size of the FDB, the
>> count of objects, and so on. The IEEE 802.1Q-2018 standard offers some
>> guidance on this topic. For instance, section "17.2.4 Structure of the
>> IEEE8021-Q-BRIDGE-MIB" defines the ieee8021QBridgeFdbDynamicCount
>> object, and section "12.7.1.1.3 Outputs" provides additional outputs
>> that can be utilized for proper bridge management.
>>
>> I've noticed that some DSA drivers implement devlink raw access to the
>> FDB. I am wondering if it would be acceptable to provide a generic
>> interface for all DSA switches for these kinds of metrics. What would be
>> the best interface to use for this purpose - devlink, sysfs, or
>> something else?
>
> It's not an easy question. It probably depends on what exactly you need
> it for.
>
> At first glance, I'd say that the bridge's netlink interface should
> probably report these, based on information collected and aggregated
> from its bridge ports. But it becomes quite complicated to aggregate
> info from switchdev and non-switchdev (Wi-Fi, plain Ethernet) ports into
> a single meaningful number. Also, the software bridge does not have a
> hard limit per se when it comes to the number of FDB entries (although
> maybe it wouldn't be such a bad idea).
>

I've had such a patch lying around for a very long time. I think I dropped it
because I also wanted to do per-port limits for dynamic entries, which are much
harder to get right, and higher-priority tasks took over at the time. I can
polish and upstream it if there is interest.

> ieee8021QBridgeFdbDynamicCount seems defined as "The current number of
> dynamic entries in this Filtering Database." So we're already outside
> the territory of statically defined "maximums" and we're now talking
> about the degree of occupancy of certain tables. That will be a lot
> harder for the software bridge to aggregate coherently, and it can't
> just count its own dynamic FDB entries. Entries learned dynamically on
> foreign interfaces would make that utilization figure quite imprecise.
> Also, some DSA switches have a VLAN-unaware FDB, and if the bridge is
> VLAN-aware, it will have one FDB entry per VLAN, whereas the hardware
> table will have a single FDB entry. Also, DSA in general does not
> attempt to sync the software FDB with the hardware FDB.
>

Agreed, it's hard to sync the hw/sw fdb.
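
To make the VLAN case above concrete, here is a minimal sketch of how the two
counts can diverge. It is plain Python, not kernel code: the function names
and the sample addresses are made up for illustration, and it only models the
"one entry per VLAN" vs. "single entry" difference, nothing else.

```python
# Hypothetical model, not kernel code: count FDB entries the way a
# VLAN-aware software bridge and a VLAN-unaware hardware table would.

def sw_fdb_count(learned):
    """VLAN-aware bridge: one entry per distinct (MAC, VID) pair."""
    return len(set(learned))

def hw_fdb_count(learned):
    """VLAN-unaware switch: the VID is ignored, one entry per distinct MAC."""
    return len({mac for mac, _vid in learned})

# The same station seen in three VLANs, plus a second station in one VLAN.
learned = [
    ("00:11:22:33:44:55", 10),
    ("00:11:22:33:44:55", 20),
    ("00:11:22:33:44:55", 30),
    ("66:77:88:99:aa:bb", 10),
]

print("software bridge entries:", sw_fdb_count(learned))
print("hardware table entries: ", hw_fdb_count(learned))
```

With this input the software side counts four entries and the hardware side
two, so neither number can simply be reported in place of the other.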

> So, while we could in theory make the bridge forward this information
> from drivers to user space in a unified form, it seems that the device
> specific information is hard to convert in a lossless form to generic
> information.
>

If it can be made consistent somehow and the counters are generic enough for
everyone to use and export, and it can work with multiple bridges and so on,
that's not so bad.

> Which is exactly the reason why we have what we have now, I guess.
>
> What do you mean by "devlink raw access"? In Documentation/networking/dsa/dsa.rst
> we say:
>
> | - Resources: a monitoring feature which enables users to see the degree of
> | utilization of certain hardware tables in the device, such as FDB, VLAN, etc.
>
> If you search for dsa_devlink_resource_register(), you'll see the
> current state of things. What is reported there as device-specific
> resources seems to be the kind of thing you would be interested in.