Re: CLONE_FILES problem.

Linus Torvalds (torvalds@cs.helsinki.fi)
Thu, 22 Aug 1996 08:38:45 +0300 (EET DST)


On Wed, 21 Aug 1996, Steven S. Dick wrote:
>
> Tom May <alfred!ucf-cs!netcom.com!ftom> wrote:
> >I recently wrote a device driver for NeXTSTEP in which a close() from
> >one thread forces an EBADF error on any pending reads in other
> >threads. This was ideal behaviour for the application I was
> >supporting, but I have no idea whether it would be a good thing in the
> >general case.

This is not the right thing to do, imho. And the reason it's not the right
thing is that it doesn't "fit" UNIX.

This is actually a case that UNIX already handles, albeit in another area.
The "unlink()" system call is more-or-less _exactly_ the same thing, except
that instead of removing the file descriptor (as in close) we remove the file
name.

In both cases we actually remove a "name", rather than the thing itself.
close() doesn't actually free up the in-kernel open file ("struct file"):
think of multiple processes sharing the same open file after "fork()" or
similar. close() only removes a name mapping, the same way "unlink()" does,
and only when there are no mappings left is the open file itself freed.
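
To make the "name mapping" point concrete, here is a minimal sketch that uses
dup() instead of fork() to get two descriptors for one open file. The helper
name bytes_after_closing_one_name() is made up for illustration; it closes
one of the two names for a pipe's write end and then pushes data through the
other.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns how many bytes still travel through a pipe
 * after one of two descriptors naming its write end has been closed, or -1
 * on error.
 */
ssize_t bytes_after_closing_one_name(void)
{
    int p[2], alias;
    char buf[16];
    ssize_t n = -1;

    if (pipe(p) < 0)
        return -1;

    alias = dup(p[1]);          /* a second "name" for the same open file */
    if (alias >= 0) {
        assert(alias != p[1]);  /* two different numbers, one open file */
        close(p[1]);            /* removes one name; the file stays open */
        if (write(alias, "hello", 5) == 5)
            n = read(p[0], buf, sizeof buf);
        close(alias);           /* last name gone: write end is released */
    } else {
        close(p[1]);
    }
    close(p[0]);
    return n;
}
```

Calling the helper should return 5: the first close() only dropped a name, and
the write end was released only by the second one.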

But when UNIX "unlink()" is called, that doesn't mean that any open files
will return errors. The file name is gone, but open files continue to work
until you close them.
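
A small sketch of that unlink() behaviour, assuming a writable /tmp and using
a made-up helper name, io_after_unlink(): it removes the file name while the
file is still open and then keeps doing I/O through the descriptor.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a scratch file, unlink its name while the
 * file is still open, then write msg and read it back through the fd.
 * Returns the number of bytes read back, or -1 on error.
 */
ssize_t io_after_unlink(const char *msg)
{
    char path[] = "/tmp/unlink-demo-XXXXXX";    /* assumed scratch location */
    char buf[64];
    size_t len = strlen(msg);
    ssize_t n = -1;
    int fd;

    assert(len <= sizeof buf);      /* the sketch only handles short messages */

    fd = mkstemp(path);
    if (fd < 0)
        return -1;

    if (unlink(path) == 0 &&                    /* the name is gone...        */
        access(path, F_OK) < 0 &&               /* ...and can't be looked up, */
        write(fd, msg, len) == (ssize_t)len &&  /* but the open file works    */
        lseek(fd, 0, SEEK_SET) == 0)
        n = read(fd, buf, sizeof buf);

    close(fd);      /* last reference dropped: only now is the file freed */
    return n;
}
```

For example, io_after_unlink("still here") should return 10 even though the
path no longer exists by the time the data is written.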

This case is actually pretty equivalent to "close()" and system calls that
use a fd (except we've moved "up" one level in the system: from "struct
inode" to "struct file"). Essentially, think of close() as removing the
"name" from the "file descriptor namespace". Any subsequent operations that
try to use that name will fail, but operations that have already looked up
the fd and are still in progress continue to use the underlying file.

I actually did that in 2.0.14 - it was trivial (and I've known about these
things with clone() for a long time), and I've done it for read() and
write(). Other operations still need to be handled too, but now that the
"groundwork" is done that should be even more trivial to do.
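
The read() case can be sketched from user space roughly as follows. This is
an illustration only: it uses POSIX threads rather than raw clone(), the names
struct pending_read, reader() and read_across_close() are invented, and the
sleep is a crude stand-in for real synchronisation. It assumes a Linux kernel
with the semantics described above; other systems may behave differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>

struct pending_read {           /* hypothetical bookkeeping for this sketch */
    int fd;                     /* descriptor number handed to the reader */
    char buf[16];
    ssize_t result;             /* what read() eventually returned */
};

static void *reader(void *arg)
{
    struct pending_read *pr = arg;

    assert(pr != NULL);
    /* The pipe is empty, so this blocks inside the kernel. */
    pr->result = read(pr->fd, pr->buf, sizeof pr->buf);
    return NULL;
}

/*
 * Returns the result of a read() whose descriptor was closed by another
 * thread while the read was pending, or -2 if the sketch failed to run.
 */
ssize_t read_across_close(void)
{
    struct pending_read pr = { .fd = -1, .result = -2 };
    struct timespec delay = { 0, 200 * 1000 * 1000 };   /* 200 ms */
    pthread_t t;
    ssize_t written;
    int p[2];

    /* If the reader loses the race below, write() should fail with EPIPE
     * instead of killing the process. Note that this is process-wide. */
    signal(SIGPIPE, SIG_IGN);

    if (pipe(p) < 0)
        return -2;
    pr.fd = p[0];
    if (pthread_create(&t, NULL, reader, &pr) != 0) {
        close(p[0]);
        close(p[1]);
        return -2;
    }

    nanosleep(&delay, NULL);    /* crude: let the reader block in read() */
    close(p[0]);                /* remove the "name" under the pending read */

    /* The pending read still refers to the pipe, so it should see this. */
    written = write(p[1], "data", 4);
    close(p[1]);                /* EOF as a fallback to release the reader */

    pthread_join(t, NULL);
    return written == 4 ? pr.result : -2;
}
```

Built with -pthread, read_across_close() is expected to return 4 rather than
-1 with EBADF: the close() removed the number, but the read that was already
in progress kept its file.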

> What if the programmer has a thread that does nothing but close unused
> files and open needed files.... and another thread that does nothing
> but handle read requests in a queue. It might mark a file as "finished"
> and the close thread would then close it, and open another file...
> The read thread might then notice and read again from the same fd,
> with the desire to get data from the new file.
>
> How do you propose the read thread should tell the kernel that it wants
> the new file instead of the old one? Open the file itself??
>
> If a thread is using a file while another thread closes it, it is a
> programming error. Nothing special should be done about this except
> that the kernel shouldn't fault because of a race condition in buggy code.

Not necessarily. We need to have a good set of semantics for everything: if
we don't have well-defined semantics then there is something basically wrong
with any model.

Non-UNIX people think it's a "programming error" to delete a file while it's
still in use. But in UNIX there are well-defined semantics for such things,
and it's not really a problem. In fact, some programs take advantage of the
fact that you can do things like that. Similarly, maybe somebody can come up
with a program that takes advantage of the fact that you can close file
descriptors while they are in use (knowing that the real close will be
delayed until the last user is done, while the "name", i.e. the fd number,
can immediately be re-used for something else).
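
As a rough illustration of re-using the number while the old file is still in
use, the sketch below blocks one thread in read() on an old pipe, closes that
fd, and immediately gives the same number to a new pipe. Everything named here
(struct pending_read, blocked_reader(), reuse_fd_number_while_in_use()) is
invented for the example, the sleep is a crude substitute for proper
synchronisation, and the outcome assumes Linux semantics as described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

struct pending_read {           /* hypothetical bookkeeping for this sketch */
    int fd;
    char buf[8];
    ssize_t result;
};

static void *blocked_reader(void *arg)
{
    struct pending_read *pr = arg;

    assert(pr != NULL);
    pr->result = read(pr->fd, pr->buf, sizeof pr->buf);
    return NULL;
}

/*
 * Returns 1 if the in-flight read() got the old file's data while the same
 * fd number already delivered the new file's data, 0 if not, and -1 if the
 * sketch could not be set up.
 */
int reuse_fd_number_while_in_use(void)
{
    struct pending_read pr = { .fd = -1, .result = -1 };
    struct timespec delay = { 0, 200 * 1000 * 1000 };   /* 200 ms */
    int oldp[2], newp[2], number, reused, ok;
    char buf[8];
    pthread_t t;

    signal(SIGPIPE, SIG_IGN);   /* a lost race should not kill the process */
    if (pipe(oldp) < 0)
        return -1;
    if (pipe(newp) < 0) {
        close(oldp[0]); close(oldp[1]);
        return -1;
    }
    /* Non-blocking, so nothing can hang on the new pipe if the race is lost. */
    fcntl(newp[0], F_SETFL, O_NONBLOCK);

    number = pr.fd = oldp[0];
    if (pthread_create(&t, NULL, blocked_reader, &pr) != 0) {
        close(oldp[0]); close(oldp[1]); close(newp[0]); close(newp[1]);
        return -1;
    }
    nanosleep(&delay, NULL);    /* crude: let the reader block in read() */

    close(number);              /* the old file loses its name... */
    reused = dup(newp[0]);      /* ...and the lowest free number is reissued */

    ok = reused == number &&
         write(oldp[1], "old", 3) == 3 &&
         write(newp[1], "new", 3) == 3;
    close(oldp[1]);             /* EOF as a fallback to release the reader */
    pthread_join(t, NULL);

    /* Old read saw the old file; the same number now reads the new one. */
    ok = ok && pr.result == 3 && memcmp(pr.buf, "old", 3) == 0 &&
         read(number, buf, sizeof buf) == 3 && memcmp(buf, "new", 3) == 0;

    if (reused >= 0)
        close(reused);
    close(newp[0]);
    close(newp[1]);
    return ok;
}
```

Built with -pthread, the function is expected to return 1: the read that was
already pending finishes against the old pipe, while new reads on the same
number go to the new one.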

Linus