Re: file as a directory

From: Scott Young
Date: Tue Nov 30 2004 - 21:48:17 EST


On Tue, 30 Nov 2004 19:39:15 +0100 (MET), Jan Engelhardt
<jengelh@xxxxxxxxxxxxxxx> wrote:
> >My suggestion is to add a framework, an infrastructure, in the VFS
> >wherein a simple plugin can be written to poke into the file as if it
> >were a directory. So with that framework in place, I can write a
> >plugin for archive support (treating the .tar files as directories),
> >Peter could write a plugin for poking into /etc/passwd (treating it as
> >a directory), and Jon Doe could write a plugin for sendmail.cf

The biggest problem I see with adding this kind of complexity to the VFS
is the bloat and the risk to system stability. However, some things
cannot be done well in userspace, such as good caching. How is one userspace
library supposed to keep a transparent cache of, for example, an index
for a tar file, not clutter up the on-disk representation of the
cache, effectively manage space utilization, and be able to
efficiently detect changes to files in order to invalidate the cache?
This would become orders of magnitude easier if a ubiquitous
filesystem interface were in use. However, the only ubiquitous
filesystem interface is VFS, which shouldn't have to take all the code
bloat.
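
To make the invalidation problem concrete, here is a rough sketch of what a
userspace library is reduced to today. It assumes that comparing the device,
inode, size and mtime from stat() is an acceptable staleness test. The names
cache_stamp, stamp_take and stamp_is_stale are made up for illustration.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical record of what the file looked like when the cache was built. */
struct cache_stamp {
    dev_t  dev;
    ino_t  ino;
    off_t  size;
    time_t mtime;   /* one-second granularity: an assumption of this sketch */
};

/* Record the current identity of `path`. Returns 0 on success, -1 on error. */
int stamp_take(const char *path, struct cache_stamp *out)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    out->dev = st.st_dev;
    out->ino = st.st_ino;
    out->size = st.st_size;
    out->mtime = st.st_mtime;
    return 0;
}

/* True if the file changed (or vanished) since the stamp was taken. */
bool stamp_is_stale(const char *path, const struct cache_stamp *old)
{
    struct cache_stamp now;

    if (stamp_take(path, &now) != 0)
        return true;
    return now.dev != old->dev || now.ino != old->ino ||
           now.size != old->size || now.mtime != old->mtime;
}
```

Note that this has to be polled on every access, and a same-size rewrite
within one second slips through, which is part of why doing this from a
library is so awkward.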

Maybe something crazy could work. Let's take some concepts from the
Aspect Oriented Programming paradigm. Whenever a program is loaded
into memory, calls in the program to the VFS interface are modified to
instead call new userspace functions that have all of the desired
functionality, and those userspace functions eventually call the real
system functions. The kernel wouldn't have to take the bloat, and the
woven-in layer would be able to do things that ordinary userspace
libraries can't do efficiently. It's the best of both worlds, with a
little insanity thrown in. (It'd be neat to see the loader bootstrap its
own code to weave in the caching of the pre-woven binaries.)
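
As a toy model of the weaving idea (not real load-time rewriting), the
sketch below uses a table of function pointers to stand in for the
program's call sites, and a weave() function to stand in for the loader.
Everything here (fs_calls, weave, the woven_* wrappers and counters) is
hypothetical. On Linux the nearest existing mechanism I know of is symbol
interposition through the dynamic linker (LD_PRELOAD), which this does not
attempt to reproduce.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical table standing in for the program's patched call sites. */
struct fs_calls {
    int     (*open_fn)(const char *path);
    ssize_t (*read_fn)(int fd, void *buf, size_t len);
};

/* The "real system functions": thin wrappers around the syscalls. */
static int real_open_ro(const char *path)
{
    return open(path, O_RDONLY);
}

static ssize_t real_read(int fd, void *buf, size_t len)
{
    return read(fd, buf, len);
}

static struct fs_calls real_calls = { real_open_ro, real_read };

/* Bookkeeping that a userspace caching layer could build on. */
static unsigned long woven_opens;
static unsigned long woven_bytes_read;

/* Woven wrapper: do the extra userspace work, then call the real function. */
static int woven_open(const char *path)
{
    woven_opens++;                      /* a cache lookup would go here */
    return real_calls.open_fn(path);
}

static ssize_t woven_read(int fd, void *buf, size_t len)
{
    ssize_t n = real_calls.read_fn(fd, buf, len);

    if (n > 0)
        woven_bytes_read += (unsigned long)n;
    return n;
}

/* What the loader would do at program load: redirect every call site. */
void weave(struct fs_calls *sites)
{
    sites->open_fn = woven_open;
    sites->read_fn = woven_read;
}
```

A program whose call sites go through a struct fs_calls initialised from
real_calls behaves as before until weave() is applied; afterwards every
open and read passes through the userspace layer first.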


> That's something I could live with, but how do you want to tag a file being
> "tar" so that tar_ops is used instead of the "default file" ops?
>
> You could not do so without an extra function, and once you use that extra
> function to tag a certain file being "tar" -- you know that extensions are
> kinda "worthless", and, especially, unreliable -- you could also have used tar
> -tvf.
>
> Did I mention tar is not the perfect format? It's because it is lacking an
> index and letting the kernel wade through a GB-sized tar file just to perform
> a readdir (yet imagine reading the last file of it) would be a hell of
> skipping. Keeping a non-persistent index in memory may solve the problem, but
> hey, I also do not want to spend too much memory just for a single tar file.

It would also be nice to have an interface that can build, maintain,
and cache on disk a persistent index into a tar file, and then delete
that index when space is running low. Plus, this index could be
generated by streaming the file through memory, so you don't need to
consume too much memory for a single file.
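
A minimal sketch of the streaming part, assuming plain ustar-style headers
(512-byte blocks, name at offset 0, octal size at offset 124). It ignores
the prefix field, long-name extensions and sparse files, and the names
tar_entry and tar_build_index are made up. Only one 512-byte header is held
in memory at a time; where the resulting entries get stored on disk is left
open.

```c
#include <stdio.h>
#include <string.h>

#define TAR_BLOCK 512

/* Hypothetical index entry: member name plus where its data starts. */
struct tar_entry {
    char name[101];
    long data_offset;   /* byte offset of the member's data in the archive */
    long size;          /* member size in bytes */
};

/* Parse an octal field as found in tar headers (NUL or space terminated). */
static long parse_octal(const char *field, size_t len)
{
    long v = 0;
    size_t i = 0;

    while (i < len && field[i] == ' ')
        i++;
    for (; i < len && field[i] >= '0' && field[i] <= '7'; i++)
        v = v * 8 + (field[i] - '0');
    return v;
}

/*
 * Stream through the archive one header at a time, seeking past the data.
 * Fills at most `max` entries; returns the count, or -1 on I/O error.
 * Uses long offsets for brevity; a real tool would want off_t and fseeko
 * for archives larger than LONG_MAX bytes.
 */
long tar_build_index(FILE *f, struct tar_entry *out, long max)
{
    char hdr[TAR_BLOCK];
    long n = 0, pos = 0;

    while (n < max && fread(hdr, 1, TAR_BLOCK, f) == TAR_BLOCK) {
        if (hdr[0] == '\0')             /* zero block marks end of archive */
            break;

        long size = parse_octal(hdr + 124, 12);

        memcpy(out[n].name, hdr, 100);
        out[n].name[100] = '\0';
        out[n].data_offset = pos + TAR_BLOCK;
        out[n].size = size;
        n++;

        /* Data is padded up to a whole number of 512-byte blocks. */
        long padded = (size + TAR_BLOCK - 1) / TAR_BLOCK * TAR_BLOCK;

        pos += TAR_BLOCK + padded;
        if (fseek(f, pos, SEEK_SET) != 0)
            return -1;
    }
    return ferror(f) ? -1 : n;
}
```

With such an index, reading the last member becomes one seek to its
data_offset instead of a walk over every header before it.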


> >struct file_operations ops = {
> > .read = tar_read,
> > .readdir = tar_readdir,
> > ......
> >};
> >
> >register_file_type("tar", &ops);
>
> Jan Engelhardt
> --
> ENOSPC