Re: [Ext2-devel] Re: [RFC, PATCH] Reservation based ext3 preallocation

From: Mingming Cao
Date: Mon Apr 05 2004 - 11:44:53 EST


On Fri, 2004-04-02 at 18:50, Andrew Morton wrote:
> Mingming Cao <cmm@xxxxxxxxxx> wrote:
> >
> > On Fri, 2004-04-02 at 17:50, Andrew Morton wrote:
> > > hm, maybe. We should probably also provide a per-file ext3-specific ioctl
> > > to allow specialised apps to manipulate the reservation size.
> > >
> > > And we should grow the reservation size dynamically. I've suggested that
> > > we double its size each time it is exhausted, up to some limit. There may
> > > be better algorithms though.
> > You mean when the reservation window is exhausted, right? I think
> > this is probably the easiest way. Maybe like the readahead window does.
> > It's just that sometimes the reserved window does not contain many
> > free blocks to allocate, and we could easily reach the upper limit.
>
> Good point. So the reservation should be grown by "the number of blocks we
> allocated in the previous window", not by "the size of the previous
> window", yes?
>
Yes. Maybe we add a counter to the reservation structure to keep track
of the preallocation hits. Then, when a new window needs to be created,
we look at the old window's preallocation hit ratio to determine how
large the next window should be.
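
A minimal sketch of that sizing rule, assuming a hit counter kept in the
reservation structure. The structure, the field names and the bounds
below are hypothetical and are not taken from the patch:

```c
#include <assert.h>

/* Illustrative bounds only; the thread does not fix actual limits. */
#define RSV_MIN_BLOCKS 8UL
#define RSV_MAX_BLOCKS 1024UL

/* Hypothetical per-inode reservation state (not the patch's structure). */
struct rsv_window {
	unsigned long start;     /* first block covered by the window */
	unsigned long size;      /* window length in blocks */
	unsigned long alloc_hit; /* blocks actually allocated from the window */
};

/*
 * Size of the next window: grow by the number of blocks allocated from
 * the old one, so a fully used window doubles and an unused one does
 * not grow. The result is clamped to the assumed bounds above.
 */
unsigned long rsv_next_size(const struct rsv_window *old)
{
	unsigned long next;

	/* We cannot have allocated more blocks than the window held. */
	assert(old->alloc_hit <= old->size);

	next = old->size + old->alloc_hit;
	if (next < RSV_MIN_BLOCKS)
		next = RSV_MIN_BLOCKS;
	if (next > RSV_MAX_BLOCKS)
		next = RSV_MAX_BLOCKS;
	return next;
}
```

With this rule a window that covered mostly in-use blocks grows slowly,
so we do not run up to the upper limit just because the old window had
few free blocks in it.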

> > Currently, when we try to reserve a window in a block group, if there
> > is no window big enough for this, we skip this group and move on to
> > the next group. I was thinking maybe we should keep track of the
> > largest available reservable window when we are looking for a new
> > window, so in case we can't find one of the expected size, we could
> > at least get one within the group.
>
> I suspect that if you cannot get a window in the blockgroup then simply
> skipping to the next blockgroup should be OK.
>
Okay.

> But I don't understand why the reservation code needs to know about
> blockgroups at all, at least from a conceptual point of view.
>
Agreed that reservation itself is a filesystem-wide concept. The
reservation window could cross a block group boundary.

> Probably it's sufficient to use the inode's blockgroup's starting block as
> the initial target for allocations and then just forget about blockgroups.
> Simply let allocation wander further up the disk from there, with no
> further consideration of blockgroups.
I think the current code's logic is the same as what you describe. The
current logic is: given a goal block, try to allocate a block starting
from there within the inode's block group. If that fails, simply move
on to the next group without a goal -- the search for a free block
starts from the first block of the next group. I was trying to keep
the same logic as before. So for the reservation code, given a goal
block, we will try to allocate a new reservation window (and then
allocate a block within it) starting from the given goal block. If that
fails, we will simply allocate a reservation window in the rest of the
disk, without consideration of the inode's block group.
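
To make the existing block-level logic concrete, here is a toy sketch
over an in-memory "used" map. It is not the ext3 code: the function
names and the tiny group size are made up for illustration, and
revisiting the goal's own group last (from its first block) is an
assumption of this sketch:

```c
#define BLOCKS_PER_GROUP 8L	/* toy value, far smaller than a real group */

/* Return the first free block in [from, end), or -1 if there is none. */
long find_free(const unsigned char *used, long from, long end)
{
	for (long b = from; b < end; b++)
		if (!used[b])
			return b;
	return -1;
}

/*
 * Allocate one block: search from the goal to the end of the goal's
 * group first, then walk the remaining groups from their first block
 * with no goal, wrapping around and ending with the goal's own group.
 * Marks the block used and returns it, or returns -1 if the disk is full.
 */
long alloc_block(unsigned char *used, long ngroups, long goal)
{
	long group = goal / BLOCKS_PER_GROUP;
	long b = find_free(used, goal, (group + 1) * BLOCKS_PER_GROUP);

	for (long i = 1; b < 0 && i <= ngroups; i++) {
		long g = (group + i) % ngroups;

		b = find_free(used, g * BLOCKS_PER_GROUP,
			      (g + 1) * BLOCKS_PER_GROUP);
	}
	if (b >= 0)
		used[b] = 1;
	return b;
}
```

The reservation version would have the same shape, with "find a free
block" replaced by "find room for a reservation window".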

>
> It would be fairly weird for the entire disk to be covered by reservations,
> so falling back to the current algorithm would be OK.
Okay.

> > > This work doesn't help us with the slowly-growing logfile or mailbox file
> > > problem. I guess that would require on-disk reservations, or a new
> > > `chattr' hint or such.
> >
> > Ted has suggested preserving the reservation/preallocation for those
> > slowly growing logfiles or mailbox files. Probably do not discard the
> > reservation window for those files (the logfiles) when they are
> > closed. When the file is opened next time, it will allocate blocks
> > directly from the old reservation window. Is that what you think?
>
> yup, except we now have potentially millions of inodes which have active
> reservations. ENOSPC and CPU consumption problems are certain.
>
> Some combination of
>
> - A chattr hint
>
> - Using O_APPEND as a hint and
>
> - Retaining an upper limit on the number of unopened inodes which have a
> reservation
>
> should fix that up. You'd need to hook into ->destroy_inode to release
> reservations when inodes are reclaimed by the VM.
>
> But this is surely phase two material.
Okay. Will think about this more later...

Thanks for your help!

Mingming
