Re: [PATCH v2 0/5] Statsfs: a new ram-based file system for Linux kernel statistics
From: Paolo Bonzini
Date: Mon May 11 2020 - 13:34:44 EST
Hi Jonathan, I think the remaining sticking point is this one:
On 11/05/20 19:02, Jonathan Adams wrote:
> I think I'd characterize this slightly differently; we have a set of
> statistics which are essentially "in parallel":
>
> - a variety of statistics, N CPUs they're available for, or
> - a variety of statistics, N interfaces they're available for.
> - a variety of statistics, N KVM objects they're available for.
>
> Recreating a parallel hierarchy of statistics any time we add/subtract
> a CPU or interface seems like a lot of overhead. Perhaps a better
> model would be some sort of "parameter enum" (naming is hard;
> parameter set?), so when a CPU/network interface/etc is added you'd
> add its ID to the "CPUs" we know about, and at removal time you'd
> take it out; it would have an associated cbarg for the value getting
> callback.
>
>> Yep, the above "not create a dentry" flag would handle the case where
>> you sum things up in the kernel because the more fine grained counters
>> would be overwhelming.
>
> nodnod; or the callback could handle the sum itself.
In general for statsfs we took a more explicit approach where each
addend in a sum is a separate stats_fs_source. In this version of the
patches it's also a directory, but we'll take your feedback and add both
the ability to hide directories (first) and to list values (second).
So, in the cases of interfaces and KVM objects I would prefer to keep
each addend separate.
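
To make the "one source per addend" idea concrete, here is a minimal
userspace sketch. All names in it (demo_source, demo_source_add_child,
demo_source_sum, the "hidden" flag) are made up for illustration; this
is not the stats_fs API from the patches, only the shape of the
aggregation: a parent's value is the sum over its child sources, and a
hidden child still counts towards the sum even though it would get no
directory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_CHILDREN 8

/* Hypothetical stand-in for a stats source; not the statsfs structure. */
struct demo_source {
	const char *name;
	uint64_t value;		/* this source's own counter */
	int hidden;		/* if set, no directory would be created */
	struct demo_source *children[DEMO_MAX_CHILDREN];
	size_t nr_children;
};

/* Attach one addend to its parent; returns 0 on success, -1 if full. */
static int demo_source_add_child(struct demo_source *parent,
				 struct demo_source *child)
{
	assert(parent != NULL && child != NULL);
	if (parent->nr_children == DEMO_MAX_CHILDREN)
		return -1;
	parent->children[parent->nr_children++] = child;
	return 0;
}

/* Sum a source's own value and, recursively, those of all its children. */
static uint64_t demo_source_sum(const struct demo_source *src)
{
	uint64_t sum;
	size_t i;

	assert(src != NULL);
	sum = src->value;
	for (i = 0; i < src->nr_children; i++)
		sum += demo_source_sum(src->children[i]);
	return sum;
}
```

With a "kvm" parent and two VM children holding 3 and 4, the parent
reads back 7 whether or not either child is hidden.
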
For CPUs, however, that would be pretty bad. Many subsystems might
accumulate stats per CPU for performance reasons, and these would then
be exposed as the sum (usually). So yeah, native handling of percpu
values makes sense. I think it should fit naturally into the same custom
aggregation framework as hash table keys; we'll see if there's any devil
in the details.
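
As a rough illustration of the percpu case, the sketch below keeps one
counter slot per CPU and only sums them when the value is read. It is
plain userspace C with invented names (demo_percpu_stat,
demo_percpu_inc, demo_percpu_sum) and a fixed DEMO_NR_CPUS; in the
kernel the slots would be real per-CPU variables, and how statsfs would
hook the sum into its aggregation code is still open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NR_CPUS 4	/* assumption: small fixed CPU count for the sketch */

/* Hypothetical per-CPU statistic: one slot per CPU, no shared counter. */
struct demo_percpu_stat {
	uint64_t count[DEMO_NR_CPUS];
};

/* Fast path: each CPU only ever touches its own slot. */
static void demo_percpu_inc(struct demo_percpu_stat *s, unsigned int cpu)
{
	assert(s != NULL && cpu < DEMO_NR_CPUS);
	s->count[cpu]++;
}

/* Slow path: the reader folds all slots into the single exposed value. */
static uint64_t demo_percpu_sum(const struct demo_percpu_stat *s)
{
	uint64_t sum = 0;
	unsigned int cpu;

	assert(s != NULL);
	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		sum += s->count[cpu];
	return sum;
}
```

The point is that there is a single exposed value per statistic, not
one source per CPU; the per-CPU split stays an implementation detail.
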
Core kernel stats such as /proc/interrupts or /proc/stat are the
exception here, since individual per-CPU values can be vital for
debugging. For those, creating a source per stat, possibly on-the-fly
at hotplug/hot-unplug time because NR_CPUS can be huge, would still be
my preferred way to do it.
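
A sketch of what "on-the-fly" could look like, again in userspace C
with hypothetical names (demo_cpu_source, demo_cpu_online,
demo_cpu_offline, demo_nr_cpu_sources): a source is allocated only when
a CPU comes online and freed when it goes away, so nothing is
populated up front for all DEMO_NR_CPUS possible CPUs. The real thing
would be driven by the CPU hotplug callbacks, which are not modelled
here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_NR_CPUS 1024	/* stand-in for a large NR_CPUS */

/* Hypothetical per-CPU source holding one individually visible value. */
struct demo_cpu_source {
	unsigned int cpu;
	uint64_t value;
};

/* One slot per possible CPU; NULL until that CPU comes online. */
static struct demo_cpu_source *demo_cpu_sources[DEMO_NR_CPUS];

/* Create the source for @cpu if missing; 0 on success, -1 on ENOMEM. */
static int demo_cpu_online(unsigned int cpu)
{
	struct demo_cpu_source *src;

	assert(cpu < DEMO_NR_CPUS);
	if (demo_cpu_sources[cpu])
		return 0;	/* already present */
	src = calloc(1, sizeof(*src));
	if (!src)
		return -1;
	src->cpu = cpu;
	demo_cpu_sources[cpu] = src;
	return 0;
}

/* Drop the source for @cpu; harmless if it was never created. */
static void demo_cpu_offline(unsigned int cpu)
{
	assert(cpu < DEMO_NR_CPUS);
	free(demo_cpu_sources[cpu]);
	demo_cpu_sources[cpu] = NULL;
}

/* Count the sources that currently exist. */
static size_t demo_nr_cpu_sources(void)
{
	size_t n = 0;
	unsigned int cpu;

	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		if (demo_cpu_sources[cpu])
			n++;
	return n;
}
```
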
Thanks,
Paolo