Re: [1/2,v2] fdmap(2)
From: Alexey Dobriyan
Date: Thu Oct 19 2017 - 11:34:43 EST
On 10/18/17, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>> fdmap() is standalone thing.
>>
>> Next step is to design fdinfo(2) (?) which uses descriptors from
>> fdmap(2). Extending structures is done as usual: bump version,
>> add new fields to the end.
>
> I very strongly disagree. If you really want a new interface for
> reading out information about other processes, design one that makes
> sense as a whole. Don't design it piecemeal. The last thing we need
> is a crappy /proc alternative that covers a small fraction of use
> cases.
Oh well.
> And demonstrate that it actually has a material benefit over
> fixing /proc.
> Meanwhile, why not just fix /proc?
/proc cannot be fixed. Let me reiterate the arguments one more time.
Using /proc
* overallocates memory, even if temporarily,
* instantiates dentries and inodes nobody cares about,
* makes the expansion path more painful than necessary:

  1) adding a field to the end of a file is the least risky move,
  2) decimal radix and unfixed positioning are used often,
  3) read() doesn't accept any sort of filter for data.

     (1)+(3) = those who added their fields later eat the cost of all
     previous fields even if their field is very simple and easy
     to calculate.

     (2)+(3) = lseek() can not be used as a substitute.

  4) adding a new file is cleaner, however every virtual file creates
     kernel objects which aren't interesting themselves.
     This has direct costs (more syscalls and lookups) and
     indirect costs (more garbage in d/icache and more flushing
     when the process dies),
  5) converting to strings eats cycles (and more cycles are consumed
     if userspace decides to convert it back to binary for whatever
     reason).
For those who didn't pay attention, the first patch to make integer
conversion faster made /proc/*/stat (?) 30% faster!
This is how slow text can be.
Sysfs is overall better, but solely because it has a strict one-value-per-file
rule, so interfaces are cleaner and there is less room for mistakes.
Philosophically, text files create the false impression that parsing text
is easy.
As an example, grab a bunch of students and ask them to write a program
which parses something innocent like /proc/*/stat for process state.
The bear trap is that ->comm is not escaped for Space and ')', so a naive
strchr() from the beginning doesn't work reliably. The only reliable
way is to count spaces from the end(!) and pray kernel devs do not
extend the file with another field or, worse, with another textual
field.
ESCAPE_SPACE doesn't escape Space, funny, isn't it?
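
To make the /proc/*/stat trap concrete, here is a minimal sketch. It is
not the count-spaces-from-the-end method described above but a related
variant: search for the last ')' in the line. The function name
stat_state() is hypothetical, and the code assumes that no field after
comm ever contains ')', so it breaks the same way if another textual
field is appended.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the state letter from one /proc/<pid>/stat line, or 0 if the
 * line does not have the expected shape.
 *
 * Assumption: every field after comm is a number or a single state
 * letter, so the last ')' in the line is the one that closes comm.
 */
char stat_state(const char *line)
{
	const char *p = strrchr(line, ')');

	/* Expect ") X" where X is the state letter. */
	if (p == NULL || p[1] != ' ' || p[2] == '\0')
		return 0;
	return p[2];
}
```

For a process whose comm is "a) R (b" the line starts with
"1234 (a) R (b) S ...". A forward scan for the first ')' would report R,
while the backward search reports S.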
/proc/*/status kind of has the same problem; however, a forward strstr()
works with "\nState:\t" because \n and \t will be escaped. But nobody
would search for just "State:", especially in scripts, right?
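
A sketch of the forward search on /proc/*/status, assuming the whole
file has been read into a NUL-terminated buffer. The function name
status_state() is hypothetical. The key includes the leading "\n" and
trailing "\t" on purpose, since those bytes cannot appear raw inside
the Name: value.

```c
#include <string.h>

/*
 * Return the state letter from the contents of /proc/<pid>/status,
 * or 0 if no "State:" line is found.
 */
char status_state(const char *buf)
{
	static const char key[] = "\nState:\t";
	const char *p = strstr(buf, key);

	if (p == NULL)
		return 0;
	/* The first byte after the key is the state letter (or NUL). */
	return p[sizeof(key) - 1];
}
```

With a process named "x State: Z", searching for plain "State:" would
match inside the Name: line first; the anchored key skips it.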
EIATF people often critique binary people for their mistakes
but themselves make equally stupid ones. "(unreachable)" anyone?
So the answer is not to fix /proc, the answer is to leave /proc alone.
The answer is to make Unix shell people move their lazy asses and
implement minimal type system and a way to execute raw system calls
like all normal programming languages do. They still haven't done it
after dozens of years and are arrogant enough to say "oh uh we can't
use cat and pipe to awk".