bio_add_page of higher order pages?

From: Lars Ellenberg
Date: Fri Jan 14 2011 - 06:25:47 EST



Someone wants to be able to add higher order pages to a bio_vec.
This is not my idea, but I was asked to comment on it,
and I'm not sure whether this is allowed or even expected.
The question came up recently, and I have not been able to find
explicit documentation stating either way.

There is

/*
 * was unsigned short, but we might as well be ready for > 64kB I/O pages
 */
struct bio_vec {
	struct page	*bv_page;
	unsigned int	bv_len;
	unsigned int	bv_offset;
};


and

/**
 *	bio_add_page	-	attempt to add page to bio
 *	@bio: destination bio
 *	@page: page to add
 *	@len: vec entry length
 *	@offset: vec entry offset
 *
 *	Attempt to add a page to the bio_vec maplist. This can fail for a
 *	number of reasons, such as the bio being full or target block
 *	device limitations. The target block device must allow bio's
 *	smaller than PAGE_SIZE, so it is always possible to add a single
 *	page to an empty bio.
 */
int bio_add_page(struct bio *bio, struct page *page, unsigned int len,
		 unsigned int offset)



Is it legal, or even expected, to be able to do

	page = alloc_pages(GFP_KERNEL, 3);
	bio_add_page(bio, page, 4*PAGE_SIZE+1536, 2*PAGE_SIZE + 512);

That is, can bv_len and bv_offset be >= PAGE_SIZE?
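
To make the numbers in that example concrete, here is a minimal userspace
sketch of what the single call would have to cover if it were broken into
ordinary per-page entries. It is not kernel code: SKETCH_PAGE_SIZE (assumed
to be 4096), struct page_seg and split_per_page() are hypothetical names
made up for this illustration.

```c
#include <stddef.h>

/* Assumption for this sketch: 4 KiB pages. The kernel's PAGE_SIZE is arch-dependent. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in for one per-page piece of a larger vec entry. */
struct page_seg {
	unsigned int page_idx;	/* index of the page within the higher order allocation */
	unsigned int offset;	/* offset within that page, always < SKETCH_PAGE_SIZE */
	unsigned int len;	/* bytes in that page, offset + len <= SKETCH_PAGE_SIZE */
};

/*
 * Split (offset, len), both relative to the first page of the allocation,
 * into pieces that each stay within one page.
 * Returns the number of pieces needed; stores at most max_segs of them.
 */
static size_t split_per_page(unsigned int offset, unsigned int len,
			     struct page_seg *segs, size_t max_segs)
{
	size_t n = 0;

	while (len > 0) {
		unsigned int in_page = offset % SKETCH_PAGE_SIZE;
		unsigned int chunk = SKETCH_PAGE_SIZE - in_page;

		if (chunk > len)
			chunk = len;
		if (n < max_segs) {
			segs[n].page_idx = offset / SKETCH_PAGE_SIZE;
			segs[n].offset = in_page;
			segs[n].len = chunk;
		}
		n++;
		offset += chunk;
		len -= chunk;
	}
	return n;
}
```

With 4 KiB pages, offset 2*PAGE_SIZE + 512 and length 4*PAGE_SIZE + 1536
touch five pages of the order-3 allocation (indices 2 through 6): 3584
bytes in the first, three full pages, then 2048 bytes in the last.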

If so, that should be stated somewhere,
and I think various pieces would need fixing.
__blk_queue_bounce, the md raid5 stripe cache,
drbd (where it uses tcp_sendpage for each bio_vec),
probably more.

If not, (bv_offset + bv_len) <= PAGE_SIZE
should be clearly documented and enforced.
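
For illustration only, the kind of check that would enforce this could look
roughly like the userspace sketch below. bvec_within_one_page() and
SKETCH_PAGE_SIZE (assumed to be 4096) are hypothetical names, not existing
kernel interfaces, and where such a check would belong in the block layer
is left open.

```c
#include <stdbool.h>

/* Assumption for this sketch: 4 KiB pages. The kernel's PAGE_SIZE is arch-dependent. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical helper: true if the range described by bv_offset and bv_len
 * stays within a single page, i.e. (bv_offset + bv_len) <= page size.
 */
static bool bvec_within_one_page(unsigned int bv_offset, unsigned int bv_len)
{
	/* Compare by subtraction so that large values cannot wrap around. */
	return bv_offset <= SKETCH_PAGE_SIZE &&
	       bv_len <= SKETCH_PAGE_SIZE - bv_offset;
}
```

Under this rule a full page at offset 0 passes, while the higher order
example above (offset 2*PAGE_SIZE + 512, length 4*PAGE_SIZE + 1536) would
be rejected.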

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.