Re: SCO: "thread creation is about a thousand times faster than on

From: Albert D. Cahalan (acahalan@cs.uml.edu)
Date: Mon Aug 28 2000 - 23:51:16 EST


Alexander Viro writes:
> On Mon, 28 Aug 2000, Albert D. Cahalan wrote:

>>> what, in the name of Cthulhu, makes us unclone() everything, no matter
>>> whether we need it or not?
>>
>> Setuid is a third special case. What happened to good taste? :-)
>
> Erm?
> * setuid is a fscking wart - THE mistake of dmr and/or ken.

If not urban legend, dmr was the patent owner.

> * close-on-exec originated as a convenience hack mostly to make
...
> * pseudo-roots are _WAY_ past the fscking wart level. Terminal

Yep, now why do you like behavior that interacts with
all three? This is some sort of S&M problem you have?

(likely: don't read linux-kernel after being awake for 2 days)

>>> E.g. would you really want to get an
>>> independent namespace upon exec()? Goodbye mount(8)...
>>
>> The namespace bit is a backwards bit, so it can be kept.
>> This is a wart that must appear everywhere CLONE_* does.
>> The pre-existing UNIX cruft is something you have to live with.
>> You didn't make fork() unshare namespaces.
>
> fork() should _NOT_ unshare them. Otherwise - goodbye mount.
> Same reasons why cd is a built-in.

No shit. "something you have to live with" You live with the
namespace flag being a special case everywhere, because it is
a backwards bit.

In case it isn't obvious yet, I'm NOT suggesting fork() change.
The comment was part of my argument for the namespace issue
being a reasonable exception to the unshare(~0) behavior.

>> BTW, mount(8) is going to need some serious hackery already,
>> just because admins will demand global (all namespace) mounting.
>
> Umm.. I'm afraid that you do not realize what you are talking
> about. Just think what mount(8) would do if it got a private
> namespace. Right, modify it. Then what? Right, exit(2).

Um, the point is you break mount(8) just by adding namespaces.
The /var/mail filesystem dies, so the admin unmounts it and
mounts a new disk... all the users, in login-enforced namespaces,
continue to see the corrupt filesystem.

So you need some hackery to make mount(8) cause a change in all
namespaces at once. Whatever is done, this will be gross.
I suppose you'd require a loopback of some sort, so that one
might rip the real filesystem out from under /var/mail instead
of trying to unmount it in 50 different namespaces... ugh.
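
A toy sketch of the problem, not kernel code. It assumes each login
gets a private copy of the mount table, and it models the "loopback"
idea as one shared object that every table points at. The Loopback
class, resolve(), and the disk_a/disk_b names are all made up for
illustration.

```python
class Loopback:
    """Hypothetical indirection layer: one object shared by every namespace."""

    def __init__(self, backing):
        self.backing = backing


def resolve(namespace, point):
    """Return the device a namespace currently sees at a mount point."""
    target = namespace[point]
    return target.backing if isinstance(target, Loopback) else target


# Case 1: private copies of the mount table, one per login.
admin = {"/var/mail": "disk_a"}
users = [dict(admin) for _ in range(3)]
admin["/var/mail"] = "disk_b"  # admin unmounts disk_a and mounts disk_b
stale = [resolve(u, "/var/mail") for u in users]  # users never see the change

# Case 2: every table refers to the same loopback object.
loop = Loopback("disk_a")
admin2 = {"/var/mail": loop}
users2 = [dict(admin2) for _ in range(3)]
loop.backing = "disk_b"  # rip the real filesystem out from under /var/mail
fresh = [resolve(u, "/var/mail") for u in users2]  # one swap reaches everyone
```

In case 1 the fix has to be repeated once per namespace; in case 2 a
single swap is enough, which is the only attraction of the indirection.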

>> Under plan c, an exec should make the world look pretty much as
>> if the last clone was a fork(). This is great for the CGI example.
>>
>> Imagine documenting partial unclone. "... except that a CLONE_FILES
>> relationship is broken when there are close-on-exec file descriptors,
>> a CLONE_FS relationship is broken if loading the new executable
>> causes a personality change ... an error may occur iff a task tries
>> to execute privileged code (setuid, setgid, set-capability-bit, or
>> unreadable executable) while party to a thread construct involving
>> CLONE_FS, CLONE_FILES, CLONE_THREAD, CLONE_PTRACE, CLONE_PARENT, or
>> certain yet-to-be-defined future clone() flags ..."
>
> Nope. Much simpler. exec() replaces the VM with new one, built
> according to the binary you are trying to load. That's what it is for.
> Consequently, it unshares (replaces) the signal context (it is bound to VM).
> Use of some warts may force exec() to unshare other resources. Precisely,
> * personalities with pseudo-roots - CLONE_FS
> * close-on-exec - CLONE_FILES
> * suid - CLONE_FILES|CLONE_FS
> [whatever else is needed]
> Notice that the regular way to get something unshared is an explicit call of
> unshare() before the exec(). Cases above are unfortunate results of the
> features that are inherently unsuitable for the clone()/rfork() model.

The explanation still sucks quite severely. The personality and
setuid problems are particularly gross, because they depend on
what executable is being started. Behavior changes drastically when
one merely recompiles with an alternate personality.
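
To make that concrete, here is a rough model of the rule quoted above
next to an unconditional unshare of the same two resources. It is a
sketch of the rule as worded in this thread, not of any kernel code.
The Executable class and both function names are invented; the two
flag values are the ones in the Linux headers and are used here only
as labels.

```python
from dataclasses import dataclass

# Flag values as defined in the Linux headers; used here only as labels.
CLONE_FS = 0x00000200
CLONE_FILES = 0x00000400


@dataclass(frozen=True)
class Executable:
    """Hypothetical description of the binary being exec'd."""

    setuid: bool = False
    pseudo_root_personality: bool = False


def partial_unshare_mask(exe, has_cloexec_fds):
    """What exec() is forced to unshare under the quoted partial rule."""
    mask = 0
    if exe.pseudo_root_personality:
        mask |= CLONE_FS
    if has_cloexec_fds:
        mask |= CLONE_FILES
    if exe.setuid:
        mask |= CLONE_FILES | CLONE_FS
    return mask


def unshare_all_mask(exe, has_cloexec_fds):
    """Unconditional unshare: the answer ignores both inputs.

    Only the two resources from the quoted list are modeled; the
    namespace exception discussed earlier is left out.
    """
    return CLONE_FS | CLONE_FILES


plain = Executable()
recompiled = Executable(pseudo_root_personality=True)
privileged = Executable(setuid=True)
```

Under the partial rule the three executables above give three
different answers for the same caller; under the unconditional rule
they all give the same one.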

If you simply unshare on exec, all this special-case crap goes away.
The "kill other threads" option also has this nice property, and the
"let user shoot foot" option only suffers from the setuid wart.


