Re: Eureka! (was Re: UTF-8 and case-insensitivity)

From: Linus Torvalds
Date: Thu Feb 19 2004 - 15:32:26 EST




On Thu, 19 Feb 2004, Linus Torvalds wrote:
> >
> > What about dentry getting dropped in the middle of that loop _and_
> > another task setting the first bit again before the loop ends?
>
> Hey, you snipped the part where I said that the application has to have
> its own locking around the loop and around the lookup to avoid races.

[ That, btw, implies that we do need to make the "set bit one" a system
call of its own, so that somebody else's "lseek(fd, 0, SEEK_SET)" wouldn't
mess up. Mea culpa. ]

Anyway, if we're willing to make some other changes to the VFS layer, we
could make all of this a bit more efficient by _not_ requiring the actual
filesystem lookup to take place.

If we had a flag that allowed a dentry to not have a d_inode pointer, but
still _not_ be considered automatically negative, then we could just make
a loop that fills the dcache directly from the readdir() data inside the
kernel, without calling down to the filesystem to look up the inode.

That would save a _lot_ of memory - quite often we'd only need the dentry
itself.

That would require a third bit in the VFS dentry flags (something like
D_DENTRY_LIKELY_POSITIVE), and would require that "d_lookup()" not just
assume that a dentry without an inode is always negative: it would check
the new flag, and if it is set, do the filesystem lookup when the lookup
actually happens.
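
[ To make the idea concrete, here is a rough userspace model of it. This
is only a sketch, not kernel code: every "model_" name, the toy
two-entry "filesystem" and the lookup counter are made up for
illustration, and MODEL_LIKELY_POSITIVE merely stands in for the
proposed flag. It shows a dentry being filled from a readdir name alone,
with the filesystem lookup deferred until somebody actually resolves it. ]

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: none of these names are real kernel API. */
#define MODEL_LIKELY_POSITIVE 0x1u /* stand-in for D_DENTRY_LIKELY_POSITIVE */

struct model_inode {
    unsigned long ino;
};

struct model_dentry {
    const char *name;
    unsigned int flags;
    struct model_inode *inode; /* NULL means negative, unless the flag is set */
};

/* Toy "filesystem": two entries, plus a counter of real lookups. */
static struct model_inode disk_inodes[] = { { 11 }, { 12 } };
static const char *disk_names[] = { "Makefile", "README" };
static int fs_lookups;

/* The expensive step we want to defer: ask the filesystem for the inode. */
static struct model_inode *model_fs_lookup(const char *name)
{
    fs_lookups++;
    for (size_t i = 0; i < sizeof(disk_names) / sizeof(disk_names[0]); i++)
        if (strcmp(disk_names[i], name) == 0)
            return &disk_inodes[i];
    return NULL;
}

/* Fill a dentry from readdir data: we know the name, not the inode. */
static void model_fill_from_readdir(struct model_dentry *d, const char *name)
{
    assert(d != NULL);
    d->name = name;
    d->inode = NULL;
    d->flags = MODEL_LIKELY_POSITIVE;
}

/* Lookup-time check: a missing inode is negative only if the flag is clear. */
static struct model_inode *model_resolve(struct model_dentry *d)
{
    assert(d != NULL);
    if (d->inode)
        return d->inode;
    if (!(d->flags & MODEL_LIKELY_POSITIVE))
        return NULL; /* genuinely negative dentry */
    d->inode = model_fs_lookup(d->name); /* deferred filesystem lookup */
    d->flags &= ~MODEL_LIKELY_POSITIVE;  /* now known positive or negative */
    return d->inode;
}
```

[ In this model, filling a dentry costs no filesystem lookup at all, the
first model_resolve() does exactly one, and later calls reuse the cached
answer. A name that turns out not to exist simply ends up as an ordinary
negative dentry once the flag is cleared. ]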

Doesn't look _too_ bad, and considering the potential memory savings (and
not having to seek around the disk to look up the inode data), it would
probably be worth thinking about at least as a "second stage".

So then we could have a dcache that is fully populated, even though the
actual inode data hasn't been loaded yet.

Comments?

Linus