Re: ext3 allocate-with-reservation latencies

From: Mingming Cao
Date: Tue Apr 05 2005 - 01:11:51 EST


On Tue, 2005-04-05 at 06:13 +0200, Ingo Molnar wrote:
> * Lee Revell <rlrevell@xxxxxxxxxxx> wrote:
>
> > I can trigger latencies up to ~1.1 ms with a CVS checkout. It looks
> > like inside ext3_try_to_allocate_with_rsv, we spend a long time in this
> > loop:
> >

We have not modify the reservation create algorithm for a long time.
Sorry, I missed the background here, on which kernel did you see this
latency? And what tests you were running?

> > ext3_test_allocatable (bitmap_search_next_usable_block)
> > find_next_zero_bit (bitmap_search_next_usable_block)
> > find_next_zero_bit (bitmap_search_next_usable_block)
> >
> > ext3_test_allocatable (bitmap_search_next_usable_block)
> > find_next_zero_bit (bitmap_search_next_usable_block)
> > find_next_zero_bit (bitmap_search_next_usable_block)
>

The loop in bitmap_search_next_usable_block tries to find the next zero
bit on the bitmap, if it find, then it need to check whether this zero
bit is also be zero on the journalling last copy the the bitmap. The
double check is there prevent re-use the just freed block on disk before
it is committed from journalling. If it's not marked as a free block on
the journalling copy, then it will find the next free bit on that copy
first, then loop back etc.etc.

The bitmap_search_next_usable_block() code is quite simple:

static int
bitmap_search_next_usable_block(int start, struct buffer_head *bh,
int maxblocks)
{
int next;
struct journal_head *jh = bh2jh(bh);

/*
* The bitmap search --- search forward alternately through the actual
* bitmap and the last-committed copy until we find a bit free in
* both
*/
while (start < maxblocks) {
next = ext3_find_next_zero_bit(bh->b_data, maxblocks, start);
if (next >= maxblocks)
return -1;
if (ext3_test_allocatable(next, bh))
return next;
jbd_lock_bh_state(bh);
if (jh->b_committed_data)
start = ext3_find_next_zero_bit(jh->b_committed_data,
maxblocks, next);
jbd_unlock_bh_state(bh);
}
return -1;
}

The latency report shows the pattern of two calls of ext3_find_next_zero_bit () after a call to ext3_test_allocatable(), matches what I suspect above.


I wonder if re-run the test with -0 noreservation mount option make any difference?

> Breaking the lock is not really possible at that point, and it doesnt
> look too easy to make that path preemptable either. (To make it
> preemptable rsv_lock would need to become a semaphore (this could be
> fine, as it's only used when a new reservation window is created).)
>
> The hard part is the seqlock - the read side is performance-critical,
> maybe it could be solved via a preemptable but still scalable seqlock
> variant that uses a semaphore for the write side? It all depends on what
> the scalability impact of using a semaphore for the new-window code
> would be.
>
The seqlock was there to prevent a smp race, but actually turns out to be unnecessary, and it has been removed recently. While even if it is there, it will not cause much latency here: the whole ext3_try_to_allocate_with_rsv is guard by ei->truncate_sem, thus no multiple readers/writers at the same time.



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/