Re: What /proc should contain [was: /proc/driver/microcode]

From: Ricky Beam (jfbeam@bluetopia.net)
Date: Thu Feb 24 2000 - 17:45:16 EST


On Thu, 24 Feb 2000, Peter T. Breuer wrote:
>> procfs is for process information (I'll be playing the part of purist today.)
>
>I disagree. (a) humans are processes. (b), what's in a name? /proc has
>come to mean "the place where the kernel outputs most of its info". Fine.

No, humans are people. :-) Humans are self-adaptive where processes aren't
(unless you've got some AI the rest of us don't.)

>> YES! Gee, let's give untrusted things access to /proc. There's all sorts
>
>Well, set their permissions appropriately then. But the fact is that
>the information has to be somewhere. How does being in proc in itself
>somehow make the information less secure than in a file somewhere else?

And if this untrusted thing has a buffer overflow or other flaw making it
possible for someone/thing to gain root access, then the permissions on
stuff in /proc are useless. The point is to give the thing access to
only what it needs without dragging a thousand other things along for
the ride. /dev is a good example... you can create a chroot()ed world
with only the nodes necessary for the thing in question without dragging
all of /dev with it.

But we can forget being able to pare out sections of procfs like this --
i.e. a procfs_chroot()? Linus wants all the sysctl() stuff to be a file
system tree. That's a reasonable thing to do, but I don't want it attached
to a static point in the file system making it difficult to cage processes.

At the core of this issue is the necessity of /proc for such everyday
ordinary things as "how many CPUs are in this thing?", for example.
There are far too many things pushed into /proc out of convenience without
regard to fast, tiny, efficient non-fs access to them. Also, people
quickly throw together a bunch of sprintf()s to hand back some status
or debugging information without spewing stuff via printk(). As a debug
tool, that's perfectly ok -- I do it myself. But the fact that people's
debugging tools become the sole point of information access is a bad thing.

>> NO. procfs was design for PROCESS INFORMATION. Over the years, people have
>
>Nonsense. Stop right there. Who cares what it was "developed for"?

The people who developed it. "We didn't design <X> to be a weapon" but
people certainly use it as one. Does that make it any more or less
right? Good? Bad?

As I said originally, I'd have to kill too many people to reverse this
evolution, and it would do little to keep it from happening again. All I can
hope to do is stop people from making it the sole interface for information.
And to stop them from dumping everything in sight as English text.

>Human beings were not developed to fly, but they do.

Actually, we don't... we make machines that fly and we go along for the
ride.

>> >But this is wrong ("evil"). They should't. Parsing is easy. Let them
>> >parse. Binary structures are for C.
>>
>> No it's not. That's one of the foundations of perl. I've seen (and even
>
>Stop. You don't seem to grok "parsing is easy". That simply tells me
>that you are one of the large number of programmers who can't write
>parsers, and therefore think it is difficult. Your argument is false
>from then on ...

Not at all. Just because YOU and I can write text parsers doesn't mean
the entire world can. There are hundreds of examples of text parsing gone
wrong throughout history. I don't care about the number of papers, books,
classes, or monkeys you've taught to do text parsing. There's a reason
lex and yacc exist and it's not because people are lazy. (They are, but
that's not the point.)

>> >Unfortuately for your argument, humans are the ones processing the
>> >information, ultimately.
>>
>> Wrong. Do some analysis of procfs accesses and you'll see most of the access
>
>Look, if I don't read the info, it's wasted. It might as well go to
>/dev/null. So I can assure you that the only important use is for human
>consumption. Proc's job is to facilitate my gathering of the information
>I need. It does that in the classical spirit of language design, by
>providing the information in convenient _readable_ formats. It's just a
>lot of ascii info. Just like configuration files. Are you arguning for
>a binary registry?

Let's take, for example, /proc/net/dev. You can 'cat' it and get all the
numbers and things you want/need right there. End of rant. However, the
instant you run 'netstat -i' you're wasting your time... netstat reads in
the text, converts it to binary numbers, and prints it back out as text
in a different format. You could argue netstat should skip the conversion
and play with the text from dev, but the point is it (and almost everything
else pulling data from /proc) DOES NOT DO THAT.

So, YES, a human will, in most cases, eventually, somewhere down the line,
see the information that came from /proc. However, if the application
accessing /proc does any work (e.g. math) on the data (such as printing
an IP address in decimal dotted-quad format) then there was no point in
presenting the application with human readable text in the first place.
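
To make the round trip concrete, here is a minimal Python sketch of what a
netstat-like tool ends up doing. The sample text only imitates the general
shape of /proc/net/dev (a name, a colon, then counters); the column layout,
the field positions, and the names parse_dev() and format_report() are
assumptions for illustration, not netstat's actual code.

```python
# Hypothetical sample imitating the shape of /proc/net/dev; the real
# column set differs between kernel versions.
SAMPLE = """\
Inter-|   Receive                  |  Transmit
 face |bytes    packets errs drop  |bytes    packets errs drop
    lo:    1024      16    0    0      1024      16    0    0
  eth0: 9876543    8123    2    0    123456    4001    0    1
"""

def parse_dev(text):
    """Text -> binary: turn each counter string into an int."""
    stats = {}
    for line in text.splitlines()[2:]:      # skip the two header lines
        name, _, rest = line.partition(":")
        stats[name.strip()] = [int(field) for field in rest.split()]
    return stats

def format_report(stats):
    """Binary -> text again, in a different layout."""
    lines = []
    for name, counters in sorted(stats.items()):
        # Assumed positions in the sample: 1 = RX packets, 5 = TX packets.
        rx_packets, tx_packets = counters[1], counters[5]
        lines.append("%-6s RX-OK %8d TX-OK %8d" % (name, rx_packets, tx_packets))
    return "\n".join(lines)

stats = parse_dev(SAMPLE)
report = format_report(stats)
print(report)
```

Every counter is formatted as text by the producer, converted back to an
integer by the consumer, and formatted as text a second time.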

>> Wrong. How often do you read /proc/net/dev? Granted, there are many things
>
>Pretty often. You don't seem to get that I read it every time I use a
>tool that accesses it. I even read it without tools (except that some
>of the formats are not very human readable, so I prefer not to ...).

See above.

>> Unless you intend on using 'cat' to view everything, my arguement stands.
>> (/proc/pci is about the only thing routinely 'cat'ed. However, there is
>
>You don't get the argument. Language design. You can write your code in
>assembler, or in C, or in Python. /proc is there to provide the
>interface that allows us to read the data with the minimum or

'us' is the program reading from /proc, not the human sitting at the console.
...

>effort appropriate to a high-level interfaces: humans and
>their scripts or minor programs. It's there to make life easy for US.

The PROGRAM is there to make it easy for 'us' to understand what the
computer understands. By passing human-centric information between
computer programs, you're creating extra (unnecessary) work.

This argument would then become one of "doing in kernel space what belongs
in user space." The kernel's job is not to interact with humans. Humans
interact with programs which interact with, among other things, the kernel.
(When's the last time you keyed in a SCSI "START STOP UNIT" command to
 eject a disk?)

>> >> Programs are forced to parse english text to do anything. (kernel
>> >
>> >And that's dead easy. What's your beef?
>>
>> Easy != Efficient. And it's easier to read a number as a number and not have
>
>Why should I care if it takes 10ms or 20ms to cat a file from proc?

(dammit) That is the most damnable statement anyone in the computer science
world can say. It's attitudes like this that will require a 500GHz processor
to run "hello world" in a few years.

    "The processor is fast enough; we don't need to make it efficient."

As I said before, if that is your attitude then kindly step away from the
keyboard and never (ever) touch kernel code again. (I said it already, so
I toned it down this time.)

>> to hunt for it in the text and then convert it from text to binary. THAT is
>> my beef.
>
>??? No it isn't. This is called parsing. It's easy.

And every time someone changes it ever so slightly -- almost unnoticed by a
human -- the parser fails to process it. If you can transfer the adaptive
parsing capabilities of a human into a C/C++/perl/sed/whatever application,
then be my guest -- just let me know how many hard drives I'll need to
hold it. (I know some image processing people who would make a solid gold
statue of you if you could.)
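
A small Python sketch of that failure mode, assuming a made-up status line
(neither the line nor parse_free_kb() comes from a real /proc file or tool):
the parser picks its field by position, and a cosmetic change a human would
barely notice makes it fail.

```python
def parse_free_kb(line):
    """Positional parser: assumes the value is always the second token."""
    return int(line.split()[1])

old_format = "MemFree:   20480 kB"       # hypothetical original output
new_format = "Mem Free:  20480 kB"       # someone adds one space to the label

print(parse_free_kb(old_format))

try:
    parse_free_kb(new_format)
except ValueError as err:
    # "Free:" is now the second token, so int() rejects it.
    print("parser broke:", err)
```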

>> Oh good god! Every programmer should be forced to program for an embeded
>> system for a day. Maybe then people would stop doing such stupid things.
>
>Embedded systems don't, by definition, have humans running them. And I
>presume you in particular have no idea at all if your embedded system
>is working correctly :-(.

You'd be surprised. Granted, most of them don't have a human on a keyboard
standing directly over them -- there's a master control station where the
person sits with the keyboard to monitor more than one system. And yes, I
do know if my embedded systems are doing their job(s)... and I do it
without spewing volumes of human readable text.

>> The human brain can parse text very effectively. Computers don't! Having
>
>Computers are faster parsers. Perhaps you'd like the speed figures for
>parsing a 10000 line fortran program on a 386sx20? I'd estimate between
>10 and 30s. Beat that! (if you like, I'll measure something of the
>sort, or dig out old figures of mine from about that era .. I seem to
>recall it taking a few seconds at a time on the hp9000's)

Did I say anything about speed? NO. Computers do exactly what you tell
them to do in the manner in which you've instructed them to do so. That is
all. If they haven't been programmed to handle a slight change of their
input, then they fail. You can completely reformat the input and a human
would adapt and continue (unless they happen to be mindless data entry
workers -- In that case, they stop to bitch about the format change for a
few hours. <grin>)

>> taught computer science for a few years, I can tell you that's not
>> "convenient" for programmers. Most of the errors students make are in
>
>Then in this case you happen to be a mistaken teacher. Doesn't matter.

No, just stupid students. (they weren't CS students, fwiw.)

>I repeat ... parsing is easy. I'll add "fast", "efficient", too. Because
>you don't seem to know that doesn't make it untrue! Check the
>literature (some of it mine! Try looking for my stuff in Software
>Practice and Experience about 95, and in TOPLAS too ...).

You know how to parse text; I know how to parse text; I cannot say the same
about the rest of the world. (And based on my former teaching experience,
I'm taking the safe route and saying they cannot.)

And adding "fast" and/or "efficient" means your code is highly specific to
its input. I've written a number of data file analysis tools. Their parsing
of text is _extremely_ fast but completely intolerant of changes to the
input.

>Ifconfig doesn't fail. netstat might. mount still works. Modutils still
>work (apart from lsmod).

- ifconfig by itself will not work -- it cannot find any devices to look up.
- netstat will fail as it reformats data from /proc/net/*.
- mount consults /proc/filesystems for a list to try.
- rmmod (-a) looks at /proc/modules to see if the module is loaded (and to
  find the modules to purge.) lsmod just puts a header on /proc/modules.
- free will not work as it reformats /proc/meminfo
- sysconf(_SC_NPROCESSORS_ONLN) will fail as it counts /proc/cpuinfo.
- some X servers will fail as they look in /proc/pci (or /proc/bus/pci)
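
The sysconf() item shows how deep the dependency goes. As a rough Python
sketch (count_processors() and the sample text are made up for illustration,
and real /proc/cpuinfo layouts vary by architecture), counting CPUs through
/proc boils down to counting lines in a text file:

```python
def count_processors(cpuinfo_text):
    """Count CPUs by counting 'processor' entries in cpuinfo-style text."""
    count = 0
    for line in cpuinfo_text.splitlines():
        key, sep, _ = line.partition(":")
        if sep and key.strip() == "processor":
            count += 1
    return count

# Hypothetical two-CPU sample; the fields other than "processor" are only
# there to show that the counter must skip unrelated lines.
SAMPLE_CPUINFO = """\
processor\t: 0
model name\t: Example CPU
bogomips\t: 799.53

processor\t: 1
model name\t: Example CPU
bogomips\t: 801.17
"""

print(count_processors(SAMPLE_CPUINFO))
print(count_processors(""))   # /proc not mounted: nothing to count
```

With /proc unmounted the text source is gone, so the count comes back as
zero even though the CPUs are still there.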

I'll stop there.

>lmsensors won't work, by definition (In case you are wondering, yes, I am
>typing this mail with /proc unmounted ;-).

I didn't say the machine would crash :-)

>> It's not the individual driver's resposibility to keep up with it. What you
>
>I agree. It looks hard to tell the inner code which process is calling
>it, but I don't think it has to be told. I'm almost sure that it's all
>taken care of by the ordinary VFS layers. Yes. It is so. The only thing
>that is not there is that there is no buffer in the interface between
>VFS and the registered proc_info (or whatever) functions. Thus the
>proc codes have to recalculate their data every time they are asked for
>a bit of it, because they don't know how else to get to the wanted bit.

And generally speaking, they shouldn't have to. The case could be argued that
drivers with a specified size should be able to seek to any point in their
data, but there are problems with that. (The entire notion of file sizes
in /proc is a grey area anyway.)

>> This is 100% the fault of procfs. Code needs to be placed in the open()
>> and close() functions for handling read()s without calling the driver's
>> function over and over again. The driver should then allocate sufficient
>
>I sort of agree. Is it open and close that are the markers
>you need? You need to maintain "process x has read up to here". I.e.
>you need to maintain a page of output until all its current readers

I would say "reader_s_"... there will only be one fd attached to this
reading instance. Ok, maybe there'd be dup()s of it? How do duplicated
fd's get handled?
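
The open()/close() idea can be sketched in Python under some loud
assumptions: ProcEntry, its method names, and the generate callback are
hypothetical stand-ins, not the procfs interface. The driver's function
runs once at open, and every later read() is served from the saved copy
until close.

```python
class ProcEntry:
    """Hypothetical stand-in for one open instance of a /proc file."""

    def __init__(self, generate):
        self._generate = generate   # the "driver" callback producing text
        self._snapshot = None
        self._pos = 0

    def open(self):
        # Call the driver exactly once and keep the result.
        self._snapshot = self._generate()
        self._pos = 0

    def read(self, count):
        # Serve from the snapshot; no driver call, no recalculation.
        chunk = self._snapshot[self._pos:self._pos + count]
        self._pos += len(chunk)
        return chunk

    def close(self):
        self._snapshot = None       # release the buffer


calls = []

def driver_status():
    calls.append(1)                 # record each time the driver is asked
    return "rx=10 tx=20 errs=0\n"

entry = ProcEntry(driver_status)
entry.open()
pieces = []
while True:
    piece = entry.read(4)           # deliberately tiny reads
    if not piece:
        break
    pieces.append(piece)
entry.close()

print("".join(pieces), end="")
print("driver calls:", len(calls))
```

In this model a dup()ed descriptor would refer to the same entry object and
so share both the snapshot and the read position, much as duplicated
descriptors already share one file offset.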

>OK. I'll look at it.

While this is certainly "the right way"... now may not be the best time to
go break procfs... softnet has made a big enough mess to clean up. Altering
the handling of procfs will potentially break _a lot_ of code.

I'll let Linus have the final word on this... this is gonna be an ISDN
sized patch <grin>

--Ricky



