Re: "statsfs" API design

From: Greg KH
Date: Sun Nov 10 2019 - 05:14:25 EST


On Sun, Nov 10, 2019 at 05:09:13AM -0500, Brian Masney wrote:
> On Sun, Nov 10, 2019 at 10:14:35AM +0100, Greg KH wrote:
> > On Sat, Nov 09, 2019 at 09:44:41PM +0300, Alexey Dobriyan wrote:
> > > > statsfs is a proposal for a new Linux kernel synthetic filesystem,
> > > > to be mounted in /sys/kernel/stats
> > >
> > > I think the /proc experiment teaches pretty convincingly that dressing
> > > things into a filesystem can be done but ultimately is a stupid idea.
> > > It adds so much overhead for small-to-medium systems.
> > >
> > > > The first user of statsfs would be KVM, which is currently exposing
> > > > its stats in debugfs
> > >
> > > > Google has KVM patches to gather statistics in a binary format
> > >
> > > Which is the right thing to do.
> >
> > It's always "simpler" to just take binary data and suck it in. That
> > works for a year or so until another value needs to be supported. Or
> > removed. Or features are backported.
> >
> > The reason text values in individual files work is they are "self
> > describable" and "self discoverable". You "know" what the value is and
> > that it is supported because the file is there or not. With binary
> > values in a single file you do not know any of that.
> >
> > So you need some way of describing the data to userspace in order for
> > this to work properly over the next 20+ years.
> >
> > Maybe something like varlink which describes the data coming from the
> > kernel in an easy-to-handle format? Or something else, but just using
> > blobs does not work over the long-term, sorry.
>
> What about using a text format like YAML? Here are some benefits:
>
> - The fields are self describing based on the key name.
> - New fields can be easily added without breaking compatibility.
> - Allows for a script to easily parse the contents while keeping
>   human readability.
> - Would work for systems that run busybox as their userspace without
>   having to install additional tools.
> - Allows for a nested data structure.

varlink was created to solve the issues that people have had with YAML
over time, so you might want to look into that :)
https://varlink.org/

> The downside is that the output would be larger than a binary interface
> but it's more maintainable in my opinion.

Binary interfaces are unmaintainable over time, especially when you do
not control both sides of the interface (unlike Google, which does
control both sides in its use of this for KVM stats).
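
To make that concrete, here is a rough userspace sketch in Python. It is
only an illustration: the counter names (exits, irq_injections,
halt_polls) and both binary layouts are made up for the example and are
not taken from any real KVM or statsfs interface. It packs a "v2" blob
that has gained one field, parses it with a tool that still assumes the
"v1" layout, and then reads the same counters from a directory holding
one text value per file.

```python
import os
import struct
import tempfile

# Hypothetical layouts, invented for this example only.
# v1 carries two counters; v2 adds a third one in the middle.
V1 = struct.Struct("<QQ")   # exits, halt_polls
V2 = struct.Struct("<QQQ")  # exits, irq_injections, halt_polls


def read_blob_as_v1(blob):
    """An old tool: it assumes the v1 layout no matter what it was handed."""
    exits, halt_polls = V1.unpack_from(blob)
    return {"exits": exits, "halt_polls": halt_polls}


def write_stats_dir(path, values):
    """Stand-in for the kernel side: one text value per file."""
    for name, value in values.items():
        with open(os.path.join(path, name), "w") as f:
            f.write(f"{value}\n")


def read_stats_dir(path):
    """A tool that discovers counters by listing the directory."""
    stats = {}
    for name in sorted(os.listdir(path)):
        with open(os.path.join(path, name)) as f:
            stats[name] = int(f.read().strip())
    return stats


values = {"exits": 100, "irq_injections": 7, "halt_polls": 42}

# The blob now uses the v2 layout, but the old tool cannot tell.
blob_v2 = V2.pack(values["exits"], values["irq_injections"], values["halt_polls"])
old_view = read_blob_as_v1(blob_v2)

# The directory reader needs no layout knowledge at all.
with tempfile.TemporaryDirectory() as d:
    write_stats_dir(d, values)
    dir_view = read_stats_dir(d)

print("old binary tool:", old_view)
print("directory tool: ", dir_view)
```

The old binary reader does not fail, it silently reports the new
counter under the old name. The directory reader picks up the new file
without any change, and a removed counter would show up as a missing
file rather than as shifted data.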

thanks,

greg k-h