Re: Quota file format proposal

Joerg Lenneis (lenneis@statrix2.wu-wien.ac.at)
Fri, 28 May 1999 16:49:09 +0200 (CEST)


Jan Kara:

>> >>> Isn't it possible to code the uid/gid in the struct too, this
>> >>> way we dont have to write a 1GB(?) file for 2 users, one at
>> >>> UID0 and one at UID 131072... Since this struct is read into
>> >>> memory structs, some small handeling shouldn't really matter
>> >>> in performance...
>>
>> >> We won't write 1GB of data. We take advantage of sparse files so
>> >> there is almost no difference between storage of UIDS 1000, 1010
>> >> and 1000, 10000...
>>
>> > Wouldn't it be a good idea at this point to switch to a more

[...]

>> > What about using a hash table instead?
>>
>> The problem with a hash table is that it's very much dependant on the
>> size of the table as to where a particular record gets placed in the
>> file, and also works best with a fairly large proportion of unused
>> entries.
> Yes. But usage ~50% isn't so bad... I'll write hash table and benchmark it
> at least :-).

It also depends on how exactly the hash table is implemented. One
could start out with a reasonable default size that is good for a
couple of thousand entries and then provide the tools to enlarge it as
needed or make a rewrite/enlargement happen automatically whenever
necessary.

>> I'd prefer to see a sorted file used for quotas, with the uid or gid
>> stored as part of the record structure in the file. If the records are
>> maintained in sorted order by uid or gid, then a binary search on the
>> file will quickly find any record therein, or confirm that a record
>> isn't present.
> <snip>
>> OK, it'll be slower than reserving an entry for every possible uid or
[...]

The "classic" approach for a large on-disk sorted list would be a B+
tree which would solve the insertion performance problem. On the other
hand, it might be a bit of overkill just to keep track of quotas.

> It won't be much slower that current implementation as it climbs tree. But
> this is hidden in ext2 implementation :-). I see the main problem in inserts
> and deletes. I agree they are not very often but spending 2 seconds on one
> insert seems too much to me :-). I'm searching the net for some interresting
> structures :-).

Here is one (requires a bit of modification, though):

ftp://koobera.math.uic.edu/www/cdb.html

> Honza.

-- 

Joerg Lenneis

University of Economics and Business Administration Department for Applied Statistics and Data Processing Augasse 2-6, 1090 Vienna, Austria

Tel. *43/1/31336 4758 email: lenneis@wu-wien.ac.at

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/