Re: [PATCH 29/32] block, aio: Batch completion for bios/kiocbs
From: Jens Axboe
Date: Fri Jan 04 2013 - 04:23:41 EST
On 2012-12-27 03:00, Kent Overstreet wrote:
> When completing a kiocb, there's some fixed overhead from touching the
> kioctx's ring buffer the kiocb belongs to. Some newer high end block
> devices can complete multiple IOs per interrupt, much like many network
> interfaces have been for some time.
> This plumbs through infrastructure so we can take advantage of multiple
> completions at the interrupt level, and complete multiple kiocbs at the
> same time.
> Drivers have to be converted to take advantage of this, but it's a
> simple change and the next patches will convert a few drivers.
> To use it, an interrupt handler (or any code that completes bios or
> requests) declares and initializes a struct batch_complete:
> struct batch_complete batch;
> Then, instead of calling bio_endio(), it calls
> bio_endio_batch(bio, err, &batch). This just adds the bio to a list in
> the batch_complete.
> At the end, it calls
> This completes all the bios all at once, building up a list of kiocbs;
> then the list of kiocbs are completed all at once.
> Also, in order to batch up the kiocbs we have to add a different
> bio_endio function to struct bio, that takes a pointer to the
> batch_complete - this patch converts the dio code's bio_endio function.
> In order to avoid changing every bio_endio function in the kernel (there
> are many), we currently use a union and a flag to indicate what kind of
> bio endio function to call. This is admittedly a hack, but should
> suffice for now.
It is indeed a hack... Famous last words as well, I'm sure that'll stick
around forever if it goes in! Any ideas on how we can clean this up
Apart from that, I think the batching makes functional sense. For the
devices where we do get batches of completions (most of them), it's the
right thing to do. Would be nice it were better integrated though, not a
Is the rbtree really faster than a basic (l)list and a sort before
completing them? Would be simpler.
A few small comments below.
> +void bio_endio_batch(struct bio *bio, int error, struct batch_complete *batch)
> + if (error)
> + bio->bi_error = error;
> + if (batch)
> + bio_list_add(&batch->bio, bio);
> + else
> + __bio_endio(bio, batch);
Ugh, get rid of this 'batch' checking.
> +static inline void bio_endio(struct bio *bio, int error)
> + bio_endio_batch(bio, error, NULL);
Just make that __bio_endio().
Same thing exists on the rq side, iirc.
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/