Re: [PATCH net-next 03/21] ethtool, stats: introduce standard XDP statistics

From: Saeed Mahameed
Date: Tue Aug 03 2021 - 19:57:29 EST


On Tue, 2021-08-03 at 13:49 -0700, Jakub Kicinski wrote:
> On Tue,  3 Aug 2021 18:36:23 +0200 Alexander Lobakin wrote:
> > Most of the driver-side XDP enabled drivers provide some statistics
> > on XDP programs runs and different actions taken (number of passes,
> > drops, redirects etc.).
>
> Could you please share the statistics to back that statement up?
> Having uAPI for XDP stats is pretty much making the recommendation
> that drivers should implement such stats. The recommendation from
> Alexei and others back in the day (IIRC) was that XDP programs should
> implement stats, not the drivers, to avoid duplication.
>

There are stats "mainly errors*" that are not even visible or reported
to the user prog, for that i had an idea in the past to attach an
exception_bpf_prog provided by the user, where driver/stack will report
errors to this special exception_prog.

> > Regarding that it's almost pretty the same across all the drivers
> > (which is obvious), we can implement some sort of "standardized"
> > statistics using Ethtool standard stats infra to eliminate a lot
> > of code and stringsets duplication, different approaches to count
> > these stats and so on.
>
> I'm not 100% sold on the fact that these should be ethtool stats.
> Why not rtnl_fill_statsinfo() stats? Current ethtool std stats are
> all pretty Ethernet specific, and all HW stats. Mixing HW and SW
> stats
> is what we're trying to get away from.
>

XDP is going to always be eBPF based ! why not just report such stats
to a special BPF_MAP ? BPF stack can collect the stats from the driver
and report them to this special MAP upon user request.

> > These new 12 fields provided by the standard XDP stats should cover
> > most, if not all, stats that might be interesting for collecting
> > and
> > tracking.
> > Note that most NIC drivers keep XDP statistics on a per-channel
> > basis, so this also introduces a new callback for getting a number
> > of channels which a driver will provide stats for. If it's not
> > implemented or returns 0, it means stats are global/device-wide.
>
> Per-channel stats via std ethtool stats are not a good idea. Per
> queue
> stats must be via the queue netlink interface we keep talking about
> for
> ever but which doesn't seem to materialize. When stats are reported
> via
> a different interface than objects they pertain to matching stats,
> objects and their lifetime becomes very murky.