Re: [RFC] File flags handling - proposal for API.

Wolfgang Walter (wolfgang.walter@stusta.mhn.de)
Sat, 26 Jun 1999 19:08:40 +0200

Next message: Dan Kegel: "Apache performance: Run queue proportional to number of connections?"
Previous message: Giuliano Pochini: "bh->b_count"

On Fri, Jun 25, 1999 at 06:55:03PM -0700, Linus Torvalds wrote:
>
>
> On Sat, 26 Jun 1999, Wolfgang Walter wrote:
> >
> > - how do we keep applications that are not prepared for this from using
> >
> > fp = open(<directory>, O_RDONLY)
> >
> > If this silently opens the 'default' file, they may not fail but
> > instead do something stupid.
>
> But that's exactly what happens now. Have you ever done a
>
> grep xxxxx *
>
> in a directory that has subdirectories? It will open the directory
> (because right now we allow it to happen even without O_DIRECTORY), but
> then when it reads it will get an error and will print out
>
> grep: yyyy: Is a directory
>
> which is obviously just not what the user meant.

Agreed. But it is the expected behaviour :-).

>
> So what is to say that what the user _meant_ was not to open the "default"
> file for that directory?
>
> No, I'm not 100% convinced it is a good idea. But I _am_ convinced that if
> people want to test it out, they should just do so, and then when we have
> a better feel for whether it makes any real sense or not can we judge
> better.
>
> For example, I would _like_ to have something like this:
>
> PATH=~/bin:$PATH
>
> and then in my bin directory I'd have my own binaries, some of them with
> their own configuration files. For example, it could look like this:
>
> 	~/bin/netscape/
> 		default -> /usr/local/bin/netscape
> 		cache/
> 		config
> 		cookies/
>
> and when somebody executes ~/bin/netscape (which is a directory with a
> shorthand - the shorthand just happens to be _another_ shorthand, namely a
> symlink), they actually end up _executing_ /usr/local/bin/netscape through
> the shorthands, and netscape could be taught to look at argv[0] as the
> location for its own config files.
>
> So then I could have two versions of netscape, each with its own
> configuration and cache, and there would be no clashes due to both trying
> to populate my home directory. More importantly, I wouldn't ever even have
> to _see_ the files, except if I went out searching.
>
> Think of it as the concept of dot-files, but taken to a higher level:
> dot-files are there to hide the details, but quite frankly they don't do
> all that good a job of it, and then when you occasionally want to look at
> one of them and do the normal
>
> ls -ld .??*
>
> you end up in shock. Wouldn't it be much nicer to just do
>
> ls ~/bin/netscape/
>
> (note the ending slash) and magically just get all the config and other
> files associated with THAT particular netscape configuration.
>
> Again, maybe it's horrible. But it's intriguing enough that I think it's
> worth at least trying.
>
> > - This is more general: how do I know that I have to use O_DIRECTORY?
> >   Doing lstat(), then open() and then fstat() is not atomic.
> >   Consider the following situation:
> >
> >   - lstat() says it is a file
> >   - the file is moved and replaced by a directory
> >   - I use open() without O_DIRECTORY, because I think it's a file
> >   - now I silently open the default file
>
> So what's the problem? You expected to open a file, and you opened a file.
> That can happen today. The fact that somebody renamed something from under
> you meant that you got a different file than the one you thought you'd
> get, but that is exactly the behaviour you have today.

To be honest, we have a situation here where several people (without root
privileges) have to manage a common directory tree together. It was hard, and
for some operations impossible, to write tools which manipulate this tree in a
secure way. Secure in the sense that root does not trust these people, nor do
they trust each other. This may seem a little paranoid, but the number of
these people is rather high, it changes rather often, and they are not unix
experts at all, which means we have to take into account that their accounts
may be misused by other people.

At the moment there seems to be only one solution that really works securely:
using a server process (there are others, but they are rather inefficient
and complicated).

This is the reason I'm relatively sensitive to these problems.

The default-file mechanism makes this harder again, because it becomes more
difficult to write tools which change the ownership of files and directories.

But I completely agree that the default-mechanism may be very useful in a lot
of situations.

And I have the strong feeling that the whole fchmod(), fchown(), ... approach
is not the correct solution for the problem "a name changes its object". In a
lot of (most) common and important situations these functions cannot be used
to ensure that a name does not change its object (in particular, you can't
ensure that a name is still unused when you want to create an object under it).
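
For a name that already has an object, the descriptor-based approach can be
sketched roughly as follows. This is only an illustration under assumptions:
open_checked_file() is a hypothetical helper, not an existing call, and the
system is assumed to provide O_NOFOLLOW. It compares the (st_dev, st_ino) pair
seen by lstat() with the one fstat() reports on the opened descriptor, so a
name that was re-pointed in between is noticed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any library): open `path` read-only,
 * but only if it is a regular file and the object we end up with is the
 * same one lstat() saw. Returns a descriptor, or -1 with errno set.
 */
int open_checked_file(const char *path)
{
    struct stat before, after;
    int fd, saved;

    assert(path != NULL);

    if (lstat(path, &before) != 0)
        return -1;
    if (!S_ISREG(before.st_mode)) {
        errno = EINVAL;            /* not a plain file: refuse */
        return -1;
    }

    /* O_NOFOLLOW: fail if the last component has become a symlink.
       O_NONBLOCK: do not hang if it has become a FIFO in the meantime. */
    fd = open(path, O_RDONLY | O_NOFOLLOW | O_NONBLOCK);
    if (fd < 0)
        return -1;

    if (fstat(fd, &after) != 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    if (after.st_dev != before.st_dev || after.st_ino != before.st_ino) {
        close(fd);
        errno = ESTALE;            /* the name now refers to another object */
        return -1;
    }
    return fd;                     /* fchown(fd, ...) etc. act on this object */
}
```

Once the check has passed, fchmod(fd, ...) and fchown(fd, ...) act on the
checked object no matter what happens to the name afterwards. The sketch does
not help with operations that take names rather than descriptors, such as
rename().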

Examples:

- creating names in an untrusted directory. A directory cannot be trusted
if any of its parents can be manipulated by other (non-root) processes.
/tmp is only the obvious example; a more subtle one is a directory I share
with another person.

- web servers, for example: if a web server allows users to export their own
directories, it is rather hard to ensure they do not export /etc/passwd or
other config files via symlinks.

- nfs-server

- samba

In short: whenever you want to check a path component by component to make
sure you are not fooled by a symlink.

In short: I see that the default mechanism does make it harder, but not
substantially.
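
The component-by-component check can be sketched like this, assuming a system
that provides openat(), O_DIRECTORY and O_NOFOLLOW (POSIX.1-2008).
open_nosymlink() is a hypothetical helper name. Each component is opened
relative to the descriptor of its already checked parent, so a parent that is
renamed or replaced afterwards no longer influences the walk.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path`, refusing to cross a symlink in any
 * component. Intermediate components must be real directories; the last
 * one is opened with `final_flags`. Returns a descriptor or -1.
 * Note: ".." components are not treated specially in this sketch.
 */
int open_nosymlink(const char *path, int final_flags)
{
    char buf[PATH_MAX];
    char *comp, *next, *save;
    int dirfd, fd, flags, saved;

    assert(path != NULL);
    if (strlen(path) >= sizeof buf) {
        errno = ENAMETOOLONG;
        return -1;
    }
    strcpy(buf, path);

    /* Start at the root for absolute paths, else at the current directory. */
    dirfd = open(path[0] == '/' ? "/" : ".", O_RDONLY | O_DIRECTORY);
    if (dirfd < 0)
        return -1;

    comp = strtok_r(buf, "/", &save);
    while (comp != NULL) {
        next = strtok_r(NULL, "/", &save);
        flags = (next != NULL ? (O_RDONLY | O_DIRECTORY) : final_flags)
                | O_NOFOLLOW;

        fd = openat(dirfd, comp, flags);   /* fails if comp is a symlink */
        saved = errno;
        close(dirfd);
        if (fd < 0) {
            errno = saved;
            return -1;
        }
        dirfd = fd;                        /* descend into checked component */
        comp = next;
    }
    return dirfd;
}
```

A web server could, for instance, call open_nosymlink() on the requested file
and serve only from the returned descriptor.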

And again: I wish we had a general mechanism for a process to tell the
kernel: for these names, do not accept any filesystem operation if the
resolution of the name has changed (that is, it now identifies another object,
or it identifies an object where the name was previously unused).

If I want to check /a/b/c/d (for example, that it does not point back to
/etc/passwd), I could do

    struct stat st;
    int e;

    /* name_lock(), name_unlock() and ENEWRESOLUTION are the proposed
       interface; they do not exist yet */
    name_lock("/a", "/a/b", "/a/b/c", "/a/b/c/d");

    e = lstat("/a", &st);
    if (e != 0 && errno == ENEWRESOLUTION) {
        fprintf(stderr, "Aha: someone changed entry /a: "
                        "it now names another object\n");
        return 1;
    }
    /* whatever you want to check for /a */
    e = lstat("/a/b", &st);
    if (e != 0 && errno == ENEWRESOLUTION) {
        fprintf(stderr, "Aha: someone changed entry /a/b: "
                        "it now names another object\n");
        return 1;
    }
    ...

    name_unlock("/a", "/a/b", "/a/b/c", "/a/b/c/d");

Or an nfs daemon could, instead of keeping a file with path p open, just
name_lock the name p. If it opens p again, this fails with ENEWRESOLUTION if
someone else removed the file or renamed another file so that it now has the
name p.

You can use it to detect if a mountpoint changed.

You can rename files in untrusted directories, because you can check all
path components, the source name and the target name, and then do the rename.

It is a rather general approach and should also eliminate almost all the
reasons why one sometimes wants an open by inode (which is a bad idea, I think).

It allows a process to protect itself against races when changing the namespace
or accessing objects by name, but does not break the unix semantics of how a
filesystem behaves.

name_lock()/name_unlock() would be a sort of weak, partial namespace snapshot.
It does not prevent the namespace from being manipulated, but it allows you
to protect your own manipulation and interpretation of the part of the
namespace you're interested in. You can say: do it only if it's still what it
was when I called name_lock() for it. It is of course not a transaction (there
is no rollback), but it is somewhat similar to an optimistic lock.
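
Without kernel support, this optimistic check can only be approximated in user
space. The following sketch is an illustration under assumptions, not the
proposed mechanism: struct name_snapshot, name_snapshot_take(),
name_snapshot_changed() and unlink_if_unchanged() are hypothetical names, and
a window remains between the check and the operation. Closing that window is
exactly what a name_lock() inside the kernel would be for.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record of what a name resolved to at snapshot time. */
struct name_snapshot {
    const char *path;   /* the name being watched (not copied) */
    bool existed;       /* did the name identify an object? */
    dev_t dev;          /* if so, which one */
    ino_t ino;
};

/* Remember the current resolution of `path`. Returns 0, or -1 on error. */
int name_snapshot_take(struct name_snapshot *s, const char *path)
{
    struct stat st;

    assert(s != NULL && path != NULL);
    s->path = path;
    s->existed = false;
    s->dev = 0;
    s->ino = 0;

    if (lstat(path, &st) == 0) {
        s->existed = true;
        s->dev = st.st_dev;
        s->ino = st.st_ino;
        return 0;
    }
    return errno == ENOENT ? 0 : -1;   /* "unused name" is a valid state */
}

/*
 * True if the name now identifies another object, or identifies one where
 * it was unused before, or vice versa. Inode numbers can be reused, so
 * this is weaker than a real kernel-side check.
 */
bool name_snapshot_changed(const struct name_snapshot *s)
{
    struct stat st;

    assert(s != NULL);
    if (lstat(s->path, &st) != 0)
        return errno == ENOENT ? s->existed : true;
    return !s->existed || st.st_dev != s->dev || st.st_ino != s->ino;
}

/* "Do it only if it is still what it was": racy user-space version. */
int unlink_if_unchanged(const struct name_snapshot *s)
{
    if (name_snapshot_changed(s)) {
        errno = ESTALE;                /* stand-in for ENEWRESOLUTION */
        return -1;
    }
    return unlink(s->path);            /* race window is right here */
}
```

ESTALE is used here only as a stand-in for the ENEWRESOLUTION error proposed
above.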

I don't know if this is difficult to implement in the linux kernel.

The more we move towards saying that the name should be used as the identifier
of an object, and that the namespace decides whether it can be accessed at all
(permissions of directories), the more we need some control over the namespace.

Excuse me for leaving the subject and talking about something relatively
unrelated.

Wolfgang Walter

-- Veni, Vidi, VISA: I came, I saw, I did a little shopping.

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/
