Re: Ext2fs and hashed table.

Stephen C. Tweedie (sct@dcs.ed.ac.uk)
Wed, 4 Jun 1997 20:31:31 +0100


Hi,

On Wed, 04 Jun 1997 12:07:04 +0200, Jan Kasprzak
<kas@informatics.muni.cz> said:

> Stephen C. Tweedie pise:
> : smurf@work.smurf.noris.de (Matthias Urlichs) writes:
> : > ... the fact that the file's position on disk are stored as
> : > <block number,length> tuples (the first three are stored in the
> : > "inode", the rest in another big b-tree), not as a
> : > direct/indirect/whatever list of blocks. I'd like to see that
> : > (minus the b-tree) as an ext2fs extension,

> : It's already on the ext2fs TODO list. :)
> :
> : The main problem with this technique is that it completely falls apart
> : when you consider Unix semantics for files with holes

> I think there will be a performance penalty for random access
> to the file -- in the standard scheme after lseek() you exactly know
> the block pointing to the data (or the index of the appropriate indirect
> block). Thus you can access any byte in the file after at most
> 4 disk operations (inode and three levels of indirect blocks).
> In the new scheme you will have to walk through the direct entries
> and probably through the part of b-tree (which can be
> O(log (index_in_the_file)) operation).

> Am I wrong?

Yes. :)

The current scheme is already O(log(maxblk)). An extent mapped scheme
is be O(log(maxblk)) too, but its big advantage is that as long as
there is not much fragmentation, the overall data structure is much
smaller and so the number of indirections is typically less than for a
direct-mapped indirect index.

The extent map implementation would still use familiar looking
indirect blocks, but each entry in those blocks would be an (offset,
location) tuple rather than just a location entry. The only extra
cost incurred by this system would be the need to do a binary chop
search to find the correct offset at each level of the indirection
table, and on average you would hope that would be balanced against
the reduced number of indirections necessary.

This is not a new idea, by the way --- the VIVA filesystem, derived
from BSD FFS, used this technique a number of years ago. They found
that the overwhelming majority of files on a typical disk were all
mapped with either zero or one indirect blocks using extend mapping.
On Linux, any file over about 256K will have two levels of
indirection, and you get your first indirection block after 12k of
data. On a disk with very little fragmentation, an extent mapped file
should be able to grow quite large without any indirect blocks at all.

Cheers,
Stephen.