Re: [PATCH net-next v2 4/4] net: hns3: support skb's frag page recycling based on page pool

From: Jesper Dangaard Brouer
Date: Wed Sep 08 2021 - 12:47:24 EST




On 08/09/2021 17.57, Jakub Kicinski wrote:
On Wed, 8 Sep 2021 18:26:35 +0300 Ilias Apalodimas wrote:
Normally I'd say put the stats in ethtool -S and the rest in debugfs
but I'm not sure if exposing pages_state_hold_cnt and
pages_state_release_cnt directly. Those are short counters, and will
very likely wrap. They are primarily meaningful for calculating
page_pool_inflight(). Given this I think their semantics may be too
confusing for an average ethtool -S user.

Putting all the information in debugfs seems like a better idea.

I can't really disagree on the aforementioned stats being confusing.
However at some point we'll want to add more useful page_pool stats (e.g the
percentage of the page/page fragments that are hitting the recycling path).
Would it still be 'ok' to have info split across ethtool and debugfs?

Possibly. We'll also see what Alex L comes up with for XDP stats. Maybe
we can arrive at a netlink API for standard things (broken record).

You said percentage - even tho I personally don't like it - there is a
small precedent of ethtool -S containing non-counter information (IOW
not monotonically increasing event counters), e.g. some vendors rammed
PCI link quality in there. So if all else fails ethtool -S should be
fine.

I agree with Ilias, that we ought-to add some page_pool stats.
*BUT* ONLY if this doesn't hurt performance!!!

We have explained before, how this is possible, e.g. by keeping consumer vs. producer counters on separate cache-lines (internally in page_pool struct and likely on per CPU for returning pages). Then the drivers ethtool functions can request the page_pool to fillout a driver provided stats area, such that the collection and aggregation of counters are not on the fast-path.

I definitely don't want to see pages_state_hold_cnt and pages_state_release_cnt being exposed directly. These were carefully designed to not hurt performance. An inflight counter can be deducted by above ethtool-driver step and presented to userspace.


Notice that while developing page_pool, I've been using tracepoints and bpftrace scripts to inspect the behavior and internals of page_pool.
See[1] and I've even written a page leak detector[2].

In principle you could write a bpftrace tool that extract stats, the same way. But I would only recommend doing this for devel phase, because these tracepoints do add some overhead.
Originally I wanted to push people to use this for stats, but I've realized that not having these stats easy available is annoying ;-)

-Jesper

[1] https://github.com/xdp-project/xdp-project/tree/master/areas/mem/bpftrace
[2] https://github.com/xdp-project/xdp-project/blob/master/areas/mem/bpftrace/page_pool_track_leaks02.bt