On Mon, 28 Aug 2000, Albert D. Cahalan wrote:
> > what, in the name of Cthulhu, makes us unclone() everything, no matter
> > whether we need it or not?
>
> Setuid is a third special case. What happened to good taste? :-)
Erm?
* setuid is a fscking wart - THE mistake of dmr and/or ken.
There were better ways to do that and credit where credit is due, they
_did_ go for better ways later. Unfortunately, we are stuck with the crap.
setuid program is _never_ a good thing and all w*nking around credentials
only spreads the mess thinner, but doesn't clean it up.
* close-on-exec originated as a covenience hack mostly to make
life with setuid sh*t somewhat easier. It is both redundant and
insufficiently strong - e.g. dup() doesn't copy c-o-e bit. Just look at
the interface - fcntl() and ioctl(), aka. dirty hack instead of clean
design.
* pseudo-roots are _WAY_ past the fscking wart level. Terminal
stage of syphilis would be more like that - almost braindead, truely
grisly sight, all innards are rotten, the thing barely limps around and
just plain stinks. Albert, I've dealt with that code quite recently and
trust me, it's a living horror. It's a bastard kludge from hell.
> > E.g. would you really want to get an
> > independent namespace upon exec()? Goodby mount(8)...
>
> The namespace bit is a backwards bit, so it can be kept.
> This is a wart that must appear everywhere CLONE_* does.
> The pre-existing UNIX cruft is something you have to live with.
> You didn't make fork() unshare namespaces.
fork() should _NOT_ unshare them. Otherwise - goodby mount. Same reasons
why cd is a built-in.
> BTW, mount(8) is going to need some serious hackery already,
> just because admins will demand global (all namespace) mounting.
Umm.. I'm afraid that you do not realize what you are talking about. Just
think what mount(8) would do if it got a private namespace. Right, modify
it. Then what? Right, exit(2). And what kind of impact would it have on
the namespace of shell and friends? Exactly, zero. mount(8) may need
hackery, but it's not going to be in that area. The whole point of that
utility is to modify the namespace of parent (and everyone who shares it
with the parent). I.e. it must share the namespace with parent.
> Under plan c, an exec should make the world look pretty much as
> if the last clone was a fork(). This is great for the CGI example.
>
> Imagine documenting partial unclone. "... except that a CLONE_FILES
> relationship is broken when there are close-on-exec file descriptors,
> a CLONE_FS relationship is broken if loading the new executable
> causes a personality change ... error may be occur iff a task tries
> to execute privileged code (setuid, setgid, set-capability-bit, or
> unreadable executable) while party to a thread construct involving
> CLONE_FS, CLONE_FILES, CLONE_THREAD, CLONE_PTRACE, CLONE_PARENT, or
> certain yet-to-be-defined future clone() flags ..."
Nope. Much simpler. exec() replaces the VM with new one, built
according to the binary you are trying to load. That's what it is for.
Consequently, it unshares replaces the signal context (it is bound to VM).
Use of some warts may force exec() to unshare other resources. Precisely,
* personalities with pseudo-roots - CLONE_FS
* close-on-exec - CLONE_FILES
* suid - CLONE_FILES|CLONE_FS
[whatever else is needed]
Notice that regulare way to get something unshared is explicit call of
unshare() before the exec(). Cases above are unfortunate results of the
features that are inherently unsuitable for the clone()/rfork() model.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Thu Aug 31 2000 - 21:00:23 EST