Re: Elastic Quota File System (EQFS)

From: Olaf Dabrunz
Date: Thu Jun 24 2004 - 06:53:07 EST


On 24-Jun-04, Amit Gud wrote:
> Ok, this is what I propose:
>
> Lets say there are just 2 users with 100 megs of individual quota, user
> A is using 20 megs and user B is running out of his quota. Now what B could
> do is delete some files himself and make some free space for storing other
> files. Now what I say is instead of deleting the files, he declares those
> files as elastic.
>
> Now, moment he makes that files elastic, that much amount of space is added
> to his quota. Here Mark Cooke's equation applies with some modifications:
> N no. of users,
> Qi allocated quota of ith user
> Ui individual disk usage of ith user ( should be <= allocated quota of ith
> user ),
> D disk threshold; thats the amount of disk space admin wants to allow the
> users to use (should be >= sum of all users' allocated quota, i.e. summation
> Qi ; for i = 0 to N - 1).
>
> Total usage of all the users (here A & B) should be at _anytime_ less than
> D. i.e. summation Ui <= D; for i = 0 to N - 1.
>
> The point to note here is that we are not bothering how much quota has been
> allocated to an individual user by the admin, but we are more interested in
> the usage pattern followed by the users. E.g. if user B wants additional
> space of say 25 megs, he picks up 25 megs of his files and 'marks' them
> elastic. Now his quota is increased to 125 megs and he can now add more 25
> megs of files; at the same time allocated quota for user A is left
> unaffected. Applying the above equation total usage now is A: 20 megs, B:
> 125 megs, now total 145 <= D, say 200 megs. Thus this should be ok for the
> system, since the usage is within bounds.
>
> Now what happens if Ui > D? This can happen when user A tries to recliam his
> space. i.e. if user A adds say more 70 megs of files, so the total usage is
> now - A: 90 megs, B: 125 megs; 215 ! <= D. The moment the total usage
> crosses the value, 'action' will be taken on the elastic files. Here elastic
> files are of user B so only those will be affected and users A's data will
> be untouched, so in a way this will be completely transparent to user A.
> What action should be taken can be specified by the user while making the
> files elastic. He can either opt to delete the file, compress it or move it
> to some place (backup) where he know he has write access. The corresponding

- having files disappear at the discretion of the filesystem seems to be
bad behaviour: either I need this file, then I do not want it to just
disappear, or I do not need it, and then I can delete it myself.

Since my idea of which files I need and which I do not need changes
over time, I believe it is far better that I can control which files I
need and which I do not need whenever other constraints (e.g. quota
filled up) make this decision necessary. Also, then I can opt to try
to convince someone to increase my quota.

- moving the file to some other place (backup) does not seem to be a
viable option:

- If the backup media is always accessible, then why can't the user
store the "elastic" files there immediately?
-> advantages:
- the user knows where his file is
- applications that remember the path to a file will be able to
access it

- If the backup media will only be accessible after manually inserting
it into some drive, this amounts to sending an E-Mail to the backup
admin and then pass a list of backup files to the backup software.

But now getting the file back involves a considerable amount of
manual and administrative work. And it involves bugging the backup
admin, who now becomes the bottleneck of your EQFS.

So this narrows down to the effective handling of backup procedures and
the effective administration of fixed quotas and centralization of data.

If you have many users it is also likely that there are more people
interested in big data-files. So you need to help these people organize
themselves e.g. by helping them to create mailing-list, web-pages or
letting them install servers that makes the data centrally available
with some interface that they can use to select parts of the data.

I would rather suggest that if the file does not fit within a given
quota, the user should apply for more quota and give reasons for that.

I believe that flexible or "elastic" allocation of ressources is a good
idea in general, but it only works if you have cheap and easy ways to
control both allocation and deallocation. So in the case of CBQ in
networks this works, since bandwidth can easily and quickly be allocated
and deallocated.

But for filesystem space this requires something like a "slower (= less
expensive), bigger, always accessible" third level of storage in the
"RAM, disk, ..." hierarchy. And then you would need an easy or even
transparent way to access files on this third level storage. And you
need to make sure that, although you obviously *need* the data for
something, you still can afford to increase retrieval times by several
orders of magnitude at the discretion of the filesystem.

But usually all this can be done by scripts as well.

Still, there is a scenario and a combination of features for such a
filesystem that IMHO would make it useful:

- Provide allocation of overquota as you described it.
- Let the filesystem move (parts of) the "elastic" files to some
third-level backing-store on an as-needed basis. This provides you
with a not-so-cheap (but cheaper than manual handling) resource
management facility.

Now you can use the third-level storage as a backing store for
hard-drive space, analoguous to what swap-space provides for RAM. And
you can "swap in" parts of files from there and cache them on the hard
drive. So "elastic" files are actually files that are "swappable" to
backing store.

This assumes that the "elastic" files meet the requirements for a
"working set" in a similar fashion as for RAM-based data. I.e. the swap
operations need only be invoked relatively seldom.

If this is not the case, your site/customer needs to consider buying
more hard drive space (and maybe also RAM).


The tradeoff for the user now is:
- do not have the big file(s) OR
- have them and be able to use them in a random-access fashion from
any application, but maybe only with a (quite) slow access time,
but without additional administrative/manual hassle

Maybe this is a good tradeoff for a significant amount of users. Maybe
there are sites/customers that have the required backing store (or would
consider buying into this). I do not know. Find a sponsor, do some field
research and give it a try.

> action will be taken until the threshold is met.
>
> Will this work?? We are relying on the 'free' space ( i.e. D - Ui ) for the
> users to benefit. The chances of having a greater value for D - Ui increases
> with the increase in the number of users, i.e. N. Here we are talking about
> 2 users but think of 10000+ users where all the users will probably never
> use up _all_ the allocated disk space. This user behavior can be well
> exploited.
>
> EQFS can be best fitted in the mail servers. Here e.g. I make whole
> linux-kernel mailing list elastic. As long as Ui <= D I get to keep all the
> messages, whenever Ui > D, messages with latest dates will be 'acted' upon.
>
> For variable quota needs, admin can allocate different quotas for different
> users, but this can get tiresome when N is large. With EQFS, he can allocate
> fixed quota for each user ( old and new ) , set up a value for D and relax.
> The users will automatically get the quota they need. One may ask that this
> can be done by just setting up value of D, checking it against summation Ui
> and not allocating individual quotas at all. But when summation Ui crosses D
> value, whose file to act on? Moreover with both individual quotas and D, we
> give users 'controlled' flexibility just like elastic - it can be stretched
> but not beyond a certain range.
>
> What happens when an user tries to eat up all the free ( D - Ui ) space?
> This answer is implementation dependent because you need to make a decision:
> should an user be allowed to make a file elastic when Ui == D . I think by
> saying 'yes' we eliminate some users' mischief of eating up all free space.
>
> Queries, comments, suggestions welcome.
>
> regs,
> AG
>
> > On Wed, 2004-06-23 at 18:53, Mark Watts wrote:
> > > > Greetings,
> > > >
> > > > I think I should discuss this in the list...
> > > >
> > > > Recently I'm into developing an Elastic Quota File System (EQFS). This
> > > > file system works on a simple concept ... give it to others if you're
> not
> > > > using it, let others use it, but on the guarantee that you get it back
> when
> > > > you need it!!
> > >
> > > How do you intend to guarantee this?
> > > Randomly deleting a users files to free up disk space is a Bad (tm)
> idea, so
> > > what other mechanism are you going to employ?
> >
> > Hi Mark, Amit,
> >
> > Simple example of a flexible quota scheme:
> >
> > N users with Q megabytes of guaranteed quota
> > D total megs of disk storage
> > The difference D - N*Q is the amount you can be flexible with.
> >
> >
> > The above is a somewhat different scheme than the 'give your unused
> > quota back to others' part of Amit's post though.
> >
> > If Amit does actually mean to have a situation where the remaining
> > guaranteed quota is less than the actual remaining free space, there is
> > *no way* to satisfy the guarantee.
> >
> > Imagine the worst case scenario if all users suddenly want their
> > guaranteed quota. The only way to free up space is deleting files from
> > over-quota users - something which would be unacceptable operationally,
> > IMHO.
> >
> >
> > That said, I'll read your tech description with interest when it comes
> > out,
> >
> > Mark
> >
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

--
Olaf Dabrunz (od/odabrunz), SUSE Linux AG, NÃrnberg

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/