Is it just me, or is anyone else struck by the similarities between
the problems in the "Re: [RFC] File flags handling - proposal for API"
thread and the various rants by Hans Reiser etc about bundling many
files together?
The new API you are trying to specify gives a side channel to extra
information stored with each directory and file. You two (Alex and
Linus) are even scraping up against the idea of having an entire
namespace associated with the extra information attached to each file
or directory.
The difference between what you want for this side channel and what
Hans wants for his glob-of-files-is-a-file is a matter of size, access
speed, and name type and convention. You'd like this side channel to
be a fixed size, say, one or two 32-bit values, you'd like to get or
set them with a single system call, and your names are 32-bit
integers, with one or two well-known names for attributes associated
with every file in the filesystem.
There are things you might like to put in this side channel, but
can't, like access control lists or whatever, because the size isn't
right. And perhaps one of you could comment on the issues of
serializing changes to the file attributes against changes to the
file contents.
Other kinds of files present size and access speed problems. Larry
McVoy, among others, points out that one of the problems with
gazillions of dot files is that they take lots of system calls to
read -- this is one of the problems that a user-space tarfs or podfuk
handles.
Perhaps what is needed isn't a new API for a side channel for every
file, but an API for quickly opening/reading/closing many very small
files. I don't claim to know the best way to do something like this,
but I could imagine something like
    void dirv_read(const char *dirname, int npairs, size_t valspace,
                   const char **names, void **vals, void *valbuf);
where names is a list of "file" names, whose contents are concatenated
into valbuf, with pointers to the start of each value pointed to by
vals, kinda like the way argc and argv work for a C main() call.
Failures on particular files might be handled by returning a NULL
vals[i] entry, or perhaps an error value in place of a pointer. I
suppose we'd also need l and f isotopes, and a write isotope.
Here Linus presents the idea of making every directory act like a file:
Linus> I really think the thing should be something like just
Linus>
Linus> lookup("default");
Linus>
Linus> Remember: it would be implemented as if it was an implied
Linus> symlink, so you can move them around and do whatever to them,
Linus> it's just that there would be a shorthand for opening a special
Linus> name under the directory.
Linus>
Linus> And it is ONLY a shorthand. Nothing more.
Now imagine we also do the converse: make every file act like a
directory. Don't write it off immediately -- don't tools like find
know whether they are looking at a file or directory by examining the
attributes? The distinction is still accessible and maintainable.
If we then consider attributes of a file (like permission
bits) to be files in that directory, we can use the above API to read
or write the attributes. We'd need to establish a particular
attribute name in common across all filesystems to hold common
permission bits, and we'd need to establish particular attribute names
for each filesystem to hold the special attributes for that filesystem
(for instance, ACLs for NTFS). I can imagine libc emulating the
above call on OSes that don't support it directly.
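As a rough illustration of that emulation, the sketch below answers a read of an attribute "file" from an ordinary stat() call. The attribute names "mode" and "size", the text encoding of the values, and the attr_read() helper itself are all hypothetical. Nothing above fixes any of them, beyond saying that some common names would have to be agreed on.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical well-known attribute names; no names have actually
 * been chosen anywhere in this discussion. */
#define ATTR_MODE "mode"
#define ATTR_SIZE "size"

/* Emulate reading attribute `name` of `path` as if it were a small
 * file inside it. Writes a NUL-terminated text value into buf and
 * returns its length, or -1 on failure. */
int attr_read(const char *path, const char *name, char *buf, size_t len)
{
    struct stat st;
    int n;

    if (stat(path, &st) != 0)
        return -1;

    if (strcmp(name, ATTR_MODE) == 0)
        n = snprintf(buf, len, "%o", (unsigned)(st.st_mode & 07777));
    else if (strcmp(name, ATTR_SIZE) == 0)
        n = snprintf(buf, len, "%lld", (long long)st.st_size);
    else
        return -1;                  /* unknown attribute name */

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

Filesystem-specific attributes such as ACLs have no stat() equivalent, so an emulation like this could only cover the common names.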
In short, the idea is to unify the concepts of files and directories,
and combine the filesystem namespace with the attribute namespace, so
that a single fast API can be used to access all the small bits of
info that go with a file, and so that the way to extend the file flags
is clear and portable.
Linus> (This is why I hate code that tries to be too generic and take
Linus> everything into account - it's not the code that is bad, but it's
Linus> the _implication_ of the code that I dislike).
Are there implications above that you dislike?
Earlier, Alex also expressed some misgivings about a general interface:
Albert> NTFS has a 4th time stamp.
Alex> Fsck, no. Not another OOB channel for whatever random
Alex> bullshit somebody wants to push through. Not another ioctl() -
Alex> we have enough mess already.
Albert> HFS has type and creator data.
Alex> No. Way. In. Hell. It's *not* a support for forks - if you
Alex> want Mac'n'dreck you know where to find it. Again, it's a *narrow*
Alex> channel - here I completely agree with Linus. The goal being to
Alex> reduce the bloody clutter, not to introduce one more dungheap a-la
Alex> ioctl().
Is it that you want fast access to a small bit of data stored with the
inode in ext2fs, or that you want to make it impossible for people to
write filesystems with extra attributes on the files without hacking
the kernel's API?
-Iain