Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]

From: Andi Kleen
Date: Thu Jul 26 2007 - 08:23:44 EST


> BTW, I really wonder how much pain could be avoided if updatedb recorded
> mtime of directories and checked it. I.e. instead of just doing blind
> find(1), walk the stored directory tree comparing timestamps with those
> in filesystem. If directory mtime has not changed, don't bother rereading
> it and just go for (stored) subdirectories. If it has changed - reread the
> sucker. If we have a match for stored subdirectory of changed directory,
> check inumber; if it doesn't match, consider the entire subtree as new
> one. AFAICS, that could eliminate quite a bit of IO...
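
For reference, the scheme quoted above could be sketched roughly as
follows, in Python for brevity. This is only an illustration under
assumptions: the record layout (ino, mtime_ns, files, subdirs) and the
name rescan() are made up here and are not updatedb's actual database
format. It relies on the fact that a directory's mtime changes when
entries are added, removed or renamed in it, which is all a name
database needs to notice.

```python
import os

def rescan(root, old_db):
    """Return (new_db, reread).

    new_db maps directory path -> record; reread lists the directories
    whose contents had to be read again. Hypothetical record layout.
    """
    new_db, reread = {}, []
    stack = [(root, True)]  # (path, may the stored record be trusted?)
    while stack:
        path, trusted = stack.pop()
        try:
            st = os.lstat(path)  # one stat per directory, none per file
        except OSError:
            continue  # vanished since the last run
        old = old_db.get(path) if trusted else None
        same_inode = old is not None and old["ino"] == st.st_ino
        if same_inode and old["mtime_ns"] == st.st_mtime_ns:
            # Unchanged directory: reuse the stored names, no readdir.
            files, subdirs = old["files"], old["subdirs"]
        else:
            try:
                with os.scandir(path) as it:
                    entries = list(it)
            except OSError:
                continue  # unreadable or no longer a directory
            subdirs = sorted(e.name for e in entries
                             if e.is_dir(follow_symlinks=False))
            files = sorted(e.name for e in entries
                           if not e.is_dir(follow_symlinks=False))
            reread.append(path)
        new_db[path] = {"ino": st.st_ino, "mtime_ns": st.st_mtime_ns,
                        "files": files, "subdirs": subdirs}
        # An inode mismatch (or no stored record) makes the whole
        # subtree count as new, so stored records below are ignored.
        for name in subdirs:
            stack.append((os.path.join(path, name), same_inode))
    return new_db, reread
```

A first run would call rescan(root, {}) and save the result; later runs
pass the saved database back in. A real tool would also have to cope
with timestamp granularity, which this sketch ignores.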

That would only save rereading the directories, and I'm not sure that
helps much. It would be much better if updatedb didn't stat the
individual files (which forces their dentries and inodes into the
cache). I bet it does that to find out whether they are directories.
On a modern system it could instead check the type in the dirent
(d_type) on file systems that support it and skip the stat. Then you
would get far fewer dentries/inodes.
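
To illustrate the dirent-type idea, here is a minimal sketch in Python
(chosen for brevity; the function name list_tree is made up for this
example). Python's os.scandir() exposes the type from the directory
entry, so is_dir(follow_symlinks=False) normally needs no stat call and
only falls back to one when the file system reports an unknown type.

```python
import os

def list_tree(root):
    """Collect every path under root without stat'ing regular files,
    as long as the file system fills in the dirent type."""
    paths = []
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    paths.append(entry.path)
                    # Decided from the dirent type where available;
                    # a stat happens only if the type is unknown.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            continue  # unreadable directory: skip it
    return paths
```

Whether this actually avoids the stat depends on the file system
returning a usable type in its directory entries.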

Also, I expect the pending slub dcache freeing work to improve
things a lot in general.

But even if updatedb were fixed to be more efficient, we would
probably still need a general solution for other tree-walking
programs that cannot be optimized this way.

-Andi
