Re: Quota file format proposal

Riley Williams (rhw@MemAlpha.CX)
Thu, 27 May 1999 21:59:49 +0100 (GMT)


Hi Joerg, Jan.

>>> Isn't it possible to code the uid/gid in the struct too, this
>>> way we dont have to write a 1GB(?) file for 2 users, one at
>>> UID0 and one at UID 131072... Since this struct is read into
>>> memory structs, some small handeling shouldn't really matter
>>> in performance...

>> We won't write 1GB of data. We take advantage of sparse files so
>> there is almost no difference between storage of UIDS 1000, 1010
>> and 1000, 10000...

> Wouldn't it be a good idea at this point to switch to a more
> flexible format for the quota file? You are correct with the
> above remarks as long as userids stay 2 bytes long. For 4 byte
> userids we are above the limit for the current maximum file size
> of ext2 with respect to the quota file.

SInce there are already systems for which the current 15-bit limit on
uid's is too small, any new quota file format will need to allow for
31-bit uid's at least, as the kernel will need to change to using
larger uid's soon anyway.

> What about using a hash table instead?

The problem with a hash table is that it's very much dependant on the
size of the table as to where a particular record gets placed in the
file, and also works best with a fairly large proportion of unused
entries.

I'd prefer to see a sorted file used for quotas, with the uid or gid
stored as part of the record structure in the file. If the records are
maintained in sorted order by uid or gid, then a binary search on the
file will quickly find any record therein, or confirm that a record
isn't present. Even with a million records in the file, a maximum of
20 records would need to be fetched to determine whether a given uid
or gid was present, and a maximum of 30 records would locate an entry
in such a file with 1,000,000,000 records.

OK, it'll be slower than reserving an entry for every possible uid or
gid and thus being able to go straight to the relevant record, but on
the other hand, it can take up considerably less disc space since only
the records for uid's or gid's with access to that partition need be
stored in its quota file. It's not even much of a slowdown either.

The three events concerned with such a file are:

1. Inserting a new record. This requires the file to be rewritten
such that the new record is inserted in the correct place, so
would be a lengthy operation in all cases where the new record
did not become the new last record in the file.

2. Reading or modifying an existing record. This would entail a
search to locate the record in the file, followed by either
reading or writing the record in question. It would therefore
be a reasonably fast operation.

3. Deleting a record from the file. This could be done in two ways,
each with different implications:

a. An otherwise meaningless value could be stored in one field
of the relevant record, and the presence of that value would
be taken by the quota code as meaning "deleted record". This
would result in this being a reasonably fast operation.

b. The whole file could be rewritten to exclude the deleted
record. This would result in this being a slow operation.

Use of option (a) could also speed up insertion of records, since
a new record could overwrite a deleted record that occurred at
the same place in the file where the new record needed to be.

In addition, the kernel's existing cache system would be likely to
speed up searches anyway.

Comments?

Best wishes from Riley.

+----------------------------------------------------------------------+
| There is something frustrating about the quality and speed of Linux |
| development, ie., the quality is too high and the speed is too high, |
| in other words, I can implement this XXXX feature, but I bet someone |
| else has already done so and is just about to release their patch. |
+----------------------------------------------------------------------+
* ftp://ftp.MemAlpha.cx/pub/rhw/Linux
* http://www.MemAlpha.cx/kernel.versions.html

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/