Re: "statsfs" API design

From: Paolo Bonzini
Date: Sun Nov 10 2019 - 15:58:34 EST


On 10/11/19 16:34, Alexey Dobriyan wrote:
> In the other direction: describe every field of /proc/*/stat file
> without looking to the manpage:
>
> $ cat /proc/self/stat
> 5349 (cat) R 5342 5349 5342 34826 5349 4210688 91 0 0 0 0 0 0 0 20 0 1 0 864988 9183232 184 18446744073709551615 94352028622848 94352028651936 140733810522864 0 0 0 0 0 0 0 0 0 17 5 0 0 0 0 0 94352030751824 94352030753376 94352060055552 140733810527527 140733810527547 140733810527547 140733810532335 0

That's why this is not what I am proposing, and also not what Greg has
mentioned.

> and realise that everything else is a waste of electricity, namely,
>
> * pathname allocation (4KB)
> * VFS '/' split, lookups ("/sys/kernel/.../" means 3+ lookups)
> * 192 bytes for each dentry
> * 550+ bytes per inode
> * 3 system calls per act of gathering statistics
> userspace will be written in the most stupid way possible
> without openat() etc
> * userspace snprintf() for pathname
> * kernel space snprintf() somewhere
> * multiple copying inside kernel (vsnprintf.c)
> * general inability for userspace to estimate the amount of data in decimal
> (nobody does that), so nicely sized buffers of 4K or 1K or 16KB (bash)
> will be used which is a waste.

Yeah, all of this is true, but I know how much I use
/sys/kernel/debug/kvm, so backwards compatibility with it is certainly a
requirement for stats. The good thing is that a high-level stats API
also lets you design something that targets use cases other than a
quick "cat" or "watch". The somewhat wasteful sysfs interface to
statsfs can even be hidden behind a Kconfig symbol once there is an
alternative. It also makes it possible to create inodes on demand if
someone is so inclined.

So the good thing is that despite the disagreements, this can be
considered an argument in favor of statsfs, and we agree on that. :)

Thanks,

Paolo