Re: [PATCH v2 0/5] Statsfs: a new ram-based file system for Linux kernel statistics

From: Emanuele Giuseppe Esposito
Date: Tue May 05 2020 - 05:18:51 EST




On 5/4/20 11:37 PM, David Rientjes wrote:
On Mon, 4 May 2020, Emanuele Giuseppe Esposito wrote:


In this patch series I introduce statsfs, a synthetic ram-based virtual
filesystem that takes care of gathering and displaying statistics for the
Linux kernel subsystems.


This is exciting, we have been looking in the same area recently. Adding
Jonathan Adams <jwadams@xxxxxxxxxx>.

In your diffstat, one thing I notice that is omitted: an update to
Documentation/* :) Any chance of getting some proposed Documentation/
updates with structure of the fs, the per subsystem breakdown, and best
practices for managing the stats from the kernel level?

Yes, I will write some documentation. Thank you for the suggestion.


Values represent quantities that are gathered by the statsfs user. Examples
of values include the number of vm exits of a given kind, the amount of
memory used by some data structure, the length of the longest hash table
chain, or anything like that. Values are defined with the
statsfs_source_add_values function. Each value is defined by a struct
statsfs_value; the same statsfs_value can be added to many different
sources. A value can be considered "simple" if it fetches data from a
user-provided location, or "aggregate" if it groups all values in the
subordinate sources that include the same statsfs_value.
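
To make the "simple" versus "aggregate" distinction concrete, here is a
minimal userspace sketch. It is not the API from the patch series: the
toy_* names and struct layouts are hypothetical, and summing is only one
way an aggregate might group subordinate values. It assumes every counter
is a u64 stored at a fixed offset inside a user-provided struct.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor: a name plus the counter's offset in user data. */
struct toy_value {
	const char *name;
	size_t offset;
};

/* Hypothetical source: shared descriptors, optional data, subordinates. */
struct toy_source {
	const struct toy_value *values;	/* may be shared by many sources */
	size_t nr_values;
	const void *base;		/* user-provided location, or NULL */
	struct toy_source **subs;	/* subordinate sources */
	size_t nr_subs;
};

/* "Simple" value: fetch the counter from the user-provided location. */
uint64_t toy_read_simple(const struct toy_source *src,
			 const struct toy_value *val)
{
	uint64_t v;

	memcpy(&v, (const char *)src->base + val->offset, sizeof(v));
	return v;
}

/* Return 1 if this source was given the very same descriptor. */
int toy_has_value(const struct toy_source *src, const struct toy_value *val)
{
	for (size_t i = 0; i < src->nr_values; i++)
		if (&src->values[i] == val)
			return 1;
	return 0;
}

/* "Aggregate" value: sum over all subordinates that include the descriptor. */
uint64_t toy_read_sum(const struct toy_source *src,
		      const struct toy_value *val)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < src->nr_subs; i++) {
		const struct toy_source *sub = src->subs[i];

		if (sub->base && toy_has_value(sub, val))
			sum += toy_read_simple(sub, val);
		sum += toy_read_sum(sub, val);	/* descend further */
	}
	return sum;
}
```

In this model a per-VM source with no data of its own would report the
sum of the counters held by its per-vCPU subordinates, because all of
them were registered with the same descriptor array.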


This seems like it could have a lot of overhead if we wanted to
periodically track the totality of subsystem stats as a form of telemetry
gathering from userspace. To collect telemetry for 1,000 different stats,
do we need to issue lseek()+read() syscalls for each of them individually
(or, worse, open()+read()+close())?
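
For reference, the per-stat cost described above can be sketched from
the userspace side. The helper below is an illustration, not part of the
series: read_stat_fd is a hypothetical name, and it assumes the stat
file holds one decimal number as text and supports positional reads. By
keeping the descriptor open and using pread(), each sample costs one
syscall per stat rather than lseek()+read(), but the cost still scales
linearly with the number of stats.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Read one text-encoded counter from an already-open stat file.
 * pread() carries its own offset, so no lseek() is needed between samples.
 * Returns 0 on success and stores the value in *out, -1 on failure.
 */
int read_stat_fd(int fd, uint64_t *out)
{
	char buf[32];
	char *end;
	unsigned long long v;
	ssize_t n;

	n = pread(fd, buf, sizeof(buf) - 1, 0);
	if (n <= 0)
		return -1;
	buf[n] = '\0';

	errno = 0;
	v = strtoull(buf, &end, 10);
	if (errno || end == buf)
		return -1;	/* out of range or no digits found */

	*out = v;
	return 0;
}
```

A collector would open() each file once at startup, then call
read_stat_fd() on every descriptor per sampling interval; for 1,000
stats that is still 1,000 syscalls per interval.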

Any thoughts on how that can be optimized? A couple of ideas:

- an interface that allows gathering of all stats for a particular
interface through a single file that would likely be encoded in binary
and the responsibility of userspace to disseminate, or

- an interface that extends beyond this proposal and allows the reader to
specify which stats they are interested in collecting and then the
kernel will only provide these stats in a well formed structure and
also be binary encoded.
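
As a rough illustration of the first idea, a single binary file could be
consumed with one read() and decoded in userspace. No such format is
defined in this thread; the record layout below (toy_stat_record, with a
numeric id and a u64 value in host byte order) and the toy_decode_stats
helper are purely hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size record; not a format defined by the series. */
struct toy_stat_record {
	uint32_t id;		/* which stat this is */
	uint32_t reserved;	/* keeps value 8-byte aligned */
	uint64_t value;
};

/*
 * Decode up to max records from a buffer filled by a single read().
 * A trailing partial record is ignored. Returns the number decoded.
 */
size_t toy_decode_stats(const void *buf, size_t len,
			struct toy_stat_record *out, size_t max)
{
	size_t n = len / sizeof(*out);

	if (n > max)
		n = max;
	memcpy(out, buf, n * sizeof(*out));
	return n;
}
```

With a layout along these lines, sampling 1,000 stats would need one
read() of about 16 KB instead of 1,000 separate reads, at the price of
userspace having to know how ids map to stat names.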

Are you thinking of another file, containing all the stats for the directory in binary format?

We've found that the one-file-per-stat method is pretty much a show
stopper from the performance view and we always must execute at least two
syscalls to obtain a single stat.

Since this is becoming a generic API (good!!), maybe we can discuss
possible ways to optimize gathering of stats en masse?

Sure, the idea of a binary format was considered from the beginning in [1], and it can be done either together with the current filesystem, or as a replacement via different mount options.

Thank you,
Emanuele

[1] https://lore.kernel.org/kvm/5d6cdcb1-d8ad-7ae6-7351-3544e2fa366d@xxxxxxxxxx/?fbclid=IwAR18LHJ0PBcXcDaLzILFhHsl3qpT3z2vlG60RnqgbpGYhDv7L43n0ZXJY8M



Signed-off-by: Emanuele Giuseppe Esposito <eesposit@xxxxxxxxxx>

v1->v2: remove unnecessary list_foreach_safe loops, fix wrong indentation,
rename statsfs to stats_fs

Emanuele Giuseppe Esposito (5):
refcount, kref: add dec-and-test wrappers for rw_semaphores
stats_fs API: create, add and remove stats_fs sources and values
kunit: tests for stats_fs API
stats_fs fs: virtual fs to show stats to the end-user
kvm_main: replace debugfs with stats_fs

MAINTAINERS | 7 +
arch/arm64/kvm/Kconfig | 1 +
arch/arm64/kvm/guest.c | 2 +-
arch/mips/kvm/Kconfig | 1 +
arch/mips/kvm/mips.c | 2 +-
arch/powerpc/kvm/Kconfig | 1 +
arch/powerpc/kvm/book3s.c | 6 +-
arch/powerpc/kvm/booke.c | 8 +-
arch/s390/kvm/Kconfig | 1 +
arch/s390/kvm/kvm-s390.c | 16 +-
arch/x86/include/asm/kvm_host.h | 2 +-
arch/x86/kvm/Kconfig | 1 +
arch/x86/kvm/Makefile | 2 +-
arch/x86/kvm/debugfs.c | 64 --
arch/x86/kvm/stats_fs.c | 56 ++
arch/x86/kvm/x86.c | 6 +-
fs/Kconfig | 12 +
fs/Makefile | 1 +
fs/stats_fs/Makefile | 6 +
fs/stats_fs/inode.c | 337 ++++++++++
fs/stats_fs/internal.h | 35 +
fs/stats_fs/stats_fs-tests.c | 1088 +++++++++++++++++++++++++++++++
fs/stats_fs/stats_fs.c | 773 ++++++++++++++++++++++
include/linux/kref.h | 11 +
include/linux/kvm_host.h | 39 +-
include/linux/refcount.h | 2 +
include/linux/stats_fs.h | 304 +++++++++
include/uapi/linux/magic.h | 1 +
lib/refcount.c | 32 +
tools/lib/api/fs/fs.c | 21 +
virt/kvm/arm/arm.c | 2 +-
virt/kvm/kvm_main.c | 314 ++-------
32 files changed, 2772 insertions(+), 382 deletions(-)
delete mode 100644 arch/x86/kvm/debugfs.c
create mode 100644 arch/x86/kvm/stats_fs.c
create mode 100644 fs/stats_fs/Makefile
create mode 100644 fs/stats_fs/inode.c
create mode 100644 fs/stats_fs/internal.h
create mode 100644 fs/stats_fs/stats_fs-tests.c
create mode 100644 fs/stats_fs/stats_fs.c
create mode 100644 include/linux/stats_fs.h

--
2.25.2