Re: New TRIM/UNMAP tree published (2009-05-02)

From: James Bottomley
Date: Sun May 03 2009 - 15:48:06 EST


On Sun, 2009-05-03 at 15:20 -0400, Jeff Garzik wrote:
> [tangent...]
>
> Does make you wonder if a ->init_rq_fn() would be helpful, one that
> could perform gfp_t allocations rather than GFP_ATOMIC? The idea being
> to call ->init_rq_fn() almost immediately after creation of struct
> request, by the struct request creator.

Isn't that what the current prep_rq_fn actually is?

> I obviously have not thought in depth about this, but it does seem that
> init_rq_fn(), called earlier in struct request lifetime, could eliminate
> the need for ->prepare_flush, ->prepare_discard, and perhaps could be a
> better place for some of the ->prep_rq_fn logic.

It's hard to see how: prep_rq_fn is already called pretty early, almost
as soon as the elevator has decided to spit out the request.

> The creator of struct request generally has more freedom to sleep, and
> it seems logical to give low-level drivers a "fill in LLD-specific info"
> hook BEFORE the request is ever added to a request_queue.

Unfortunately it's not really possible to find a sleeping context in
there: the elevators have to operate from the current
elv_next_request() context, which in most drivers can be either user or
interrupt context.

The way the block layer is designed is to pull allocations up the stack
much closer to the process (usually at the bio creation point) because
that allows the elevators to operate even in memory-starved conditions.
If we pushed the allocation down into the request level, we'd need some
type of threading (bad for performance) and the request processing would
stall when some GFP_KERNEL allocation went out to lunch finding memory.
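
To make the split concrete, here is a rough userspace sketch of the
pattern (not kernel code): the blocking allocation happens where the
request is created, and the later prepare step only fills memory it was
handed. Every name below (sketch_bio, sketch_submit_discard,
sketch_prepare_discard, sketch_free) is hypothetical and exists only for
illustration; malloc()/calloc() stand in for a GFP_KERNEL allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumed page size */

/* Hypothetical stand-in for a bio that carries a discard payload. */
struct sketch_bio {
	uint64_t start_sector;
	uint32_t nr_sectors;
	unsigned char *payload;  /* reserved at submission time */
};

/*
 * Submission side: models process context, where an allocation is
 * allowed to block. The payload page is reserved here, up front.
 */
struct sketch_bio *sketch_submit_discard(uint64_t start, uint32_t nr)
{
	struct sketch_bio *bio = malloc(sizeof(*bio));

	if (!bio)
		return NULL;
	bio->payload = calloc(1, SKETCH_PAGE_SIZE);
	if (!bio->payload) {
		free(bio);
		return NULL;
	}
	bio->start_sector = start;
	bio->nr_sectors = nr;
	return bio;
}

/*
 * Dispatch side: models the context the elevator runs in, which may be
 * interrupt context. It performs no allocation at all; it only writes
 * into the page reserved above. Returns the number of bytes written.
 */
size_t sketch_prepare_discard(struct sketch_bio *bio)
{
	size_t off = 0;

	assert(bio && bio->payload);
	memcpy(bio->payload + off, &bio->start_sector, sizeof(bio->start_sector));
	off += sizeof(bio->start_sector);
	memcpy(bio->payload + off, &bio->nr_sectors, sizeof(bio->nr_sectors));
	off += sizeof(bio->nr_sectors);
	return off;
}

/* Release the bio together with the page reserved at submission. */
void sketch_free(struct sketch_bio *bio)
{
	if (bio) {
		free(bio->payload);
		free(bio);
	}
}
```

The point of the shape is that sketch_prepare_discard() cannot fail for
lack of memory, so nothing on the dispatch path ever has to wait for an
allocator.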

The ideal for REQ_TYPE_DISCARD seems to be to force a page allocation
tied to a bio when it's issued at the top. That way everyone has enough
memory when it comes down the stack (both the extent list and the WRITE
SAME sector will fit into a page, although only just for WRITE SAME on
4k sectors).
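
As a back-of-the-envelope check on the page-size claim, the arithmetic
can be sketched as below. It assumes a 4096-byte page and the UNMAP
parameter list layout from the SBC-3 drafts (an 8-byte header followed
by 16-byte block descriptors); the function and constant names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum {
	SKETCH_UNMAP_HEADER_BYTES = 8,   /* assumed: SBC-3 UNMAP list header */
	SKETCH_UNMAP_DESC_BYTES   = 16,  /* assumed: one UNMAP block descriptor */
};

/* How many UNMAP extents (block descriptors) fit in one page. */
size_t sketch_unmap_descs_per_page(size_t page_bytes)
{
	assert(page_bytes >= SKETCH_UNMAP_HEADER_BYTES);
	return (page_bytes - SKETCH_UNMAP_HEADER_BYTES) / SKETCH_UNMAP_DESC_BYTES;
}

/* WRITE SAME carries exactly one logical sector as its payload. */
int sketch_write_same_fits(size_t page_bytes, size_t sector_bytes)
{
	return sector_bytes <= page_bytes;
}

/* Bytes left over in the page after the WRITE SAME sector. */
size_t sketch_write_same_slack(size_t page_bytes, size_t sector_bytes)
{
	assert(sketch_write_same_fits(page_bytes, sector_bytes));
	return page_bytes - sector_bytes;
}
```

Under those assumptions a page holds 255 extents, while a WRITE SAME
payload for a 4k sector fills the page with zero bytes to spare, which
is the "only just" case.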

James

