procfs and privacy and a few other questions

From: Wolfgang Draxinger
Date: Tue Jul 18 2006 - 18:18:45 EST

Today I friend and me got a bit curious about the default permissions
of inodes in the procfs. By default they allow everybody to read the
procfs of foreign processes. Some people might see this as a lack of
privacy, since everybody logged in into the system can see the
command line of their processes, indicating the tasks they're doing
or worse, disclose actual data.

The question was if there were a way to prevent other people see the
processes (or at least their command line arguments) with "ps aux" et
al. The other question is of course if such a prevention would make
sense at all. I argued, that at least those foreign processes should
be visible that are parents of the process attempting to gather that

A quick "chmod -R og-rx /proc/*" showed, that this does the job for
the currently running processes, but I doubt, that this is a sane

Since there is no umask mount option for procfs there must be a
sensefull reason for the "everybody may read others proc data"


Another question, also on procfs is, where exactly does the race
condition of the recently reported privilege escalation exploit does
take place. In the few days I was experimenting a bit with the code,
trying to inject non a.out formats, and eventually the Mono framework
or the WINE wrapper could allow to inject code through a BINFMT_MISC
handler, but I'm not through on this. Gladly the thing is fixed, but
given the fact that there seem to be still a lot of servers on which
even the sys_prctl exploit isn't fixed yet I'm a bit precarious about
the whole situation.


Next question: I'm currently working on a binary format that is
explicitly designed for the use in modularized, object oriented
programs, kinda bit like CLI/.Net/Mono. The big difference is, that
is focuses on normal "unmanaged" code. To use this format I had to
write a wrapper - of course - that loads the modules, dereferences
symbols and so on, you know the drill. However the module systems
allows for extension of the loader mechanism, through a second level
loader that is used for the major work in the whole system. However
this second level loader must be bootstrapped by the wrapper upon
program start - just like the Linux kernel gets bootstrapped. This
results in 2 disjunct versions of the loader code being present in
the running binary, of which the initial loader only takes away
resources. My idea was, that the initial loader could somehow
construct a fully functional binary in memory and then somehow
replace its process code with the in memory binary. I wonder if there
is such a possibility with the Linux kernel. Best thing I found so
far was fexecve.


And last but not least: I wrote a little, no tiny is the better word,
device driver for some gamma scintillation counter ADC hardware (GKH
may remember me, writing him an email on the stable driver ABI issue,
in which I mentioned this driver). That driver is simpler than SKULL
in "Linux Device Drivers" and it has proven stable so far (20
computers running 24/7 and putting data through the guts of the
driver with a rate of up to 100k counts per second (each count
triggers the IRQ and the IMHO a bit crude locking scheme). So far the
systems are running stable and required no reboot since March
(they're not connected to the internet and no foreign users have
access to them, so no need to patch security issues).

Yet I'm concern that it might break something or I made something
deliberately wrong, heck the last time I wrote kernel code was back
in 2.4.1 times, I micht be a bit out of practice.

Is there the chance, that somebody might check the code for proper
Linux coding style, not (just) in the way of source formatting but
the splitting up of the source files, the use of include files (e.g.
I put all IOCTL constants in a separate file with no source
dependencies and can be included from both Linux kernel code and user
space code without requiring includes "from the other world") and of
course the way I'm doing things. I know that the locking scheme is
very crude, but I'm quite sure, that the worst that can happen (if
the spinlock mechanism is bugfree) is, that one sample might be read
to small if a read from userspace is preempted by IRQ, and this read
will increment exactly this sample; the probability for this is very
small though: (2^12)^-2. However the samples are a histogram, and on
the next readout (about 1/50sek later) the correct value will be
there again. Accepting this little glitch saved me from implementing
a lot of infrastructure (workqueues, dynamic ringbuffers - there
might come in up to 100k IRQs per second, each only delivering one
single data word to be fetched with inw). I'm just a bit uneasy on
the whole thing.


Phew, I think that's it, I hope, that I'm not disturbing the kernel
guys too much from their great work and that they might find a few
moments to awnser me. Though I'm subscribed to the daily LKML digest
I'd be gladfull if awnsers could also be CCed to my PM.

Thanks, and happy coding

Wolfgang Draxinger

Attachment: pgp00000.pgp
Description: PGP signature