Re: RFC - how to balance Dirty+Writeback in the face of slow writeback.

From: Jens Axboe
Date: Fri Aug 25 2006 - 02:32:21 EST


On Fri, Aug 25 2006, Neil Brown wrote:
> On Monday August 21, axboe@xxxxxxx wrote:
> >
> > But these numbers are in no way tied to the hardware. It may be totally
> > reasonable to have 3GiB of dirty data on one system, and it may be
> > totally unreasonable to have 96MiB of dirty data on another. I've always
> > thought that assuming any kind of reliable throttling at the queue level
> > is broken and that the vm should handle this completely.
>
> I keep changing my mind about this. Sometimes I see it that way,
> sometimes it seems very sensible for throttling to happen at the
> device queue.
>
> Can I ask a question: Why do we have a 'nr_requests' maximum? Why
> not just allocate request structures whenever a request is made?
> Is there some reason relating to making the block layer work more
> efficiently? Or is it just because the VM requires it?

It's by and large because the vm requires it. Historically the limit was
there because the requests were statically allocated. Later the limit
helped bound runtimes for the io scheduler, since the merge and sort
operations were O(N) each. Right now any of the io schedulers can
handle a larger number of requests without breaking a sweat, but the vm
goes pretty nasty if you set (e.g.) 8192 requests as your limit.

The limit is also handy for avoiding filling memory with request
structures. At some point there's little benefit to doing larger queues,
depending on the workload and hardware. 128 is usually a pretty fair
number, so...
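
For anyone who wants to check the limit on their own box, here is a minimal
sketch that reads it from sysfs. It assumes the queue exports the value at
/sys/block/<dev>/queue/nr_requests; the helper name read_nr_requests and
the sysfs_root parameter are made up for illustration and are not part of
any kernel interface.

```python
import os


def read_nr_requests(device, sysfs_root="/sys/block"):
    """Return the request limit the block queue of `device` reports.

    `device` is a block device name such as "sda" (an assumed example).
    `sysfs_root` is a hypothetical parameter that lets the lookup be
    pointed at a different directory, e.g. for testing.
    """
    path = os.path.join(sysfs_root, device, "queue", "nr_requests")
    with open(path) as f:
        return int(f.read().strip())
```

Calling read_nr_requests("sda") on a system with that device should return
whatever limit is currently configured for its queue.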

> I'm beginning to think that the current scheme really works very well
> - except for a few 'bugs'(*).

It works ok, but it makes it hard to experiment with larger queue depths
when the vm falls apart :-). It's not a big deal, though, even if the
design isn't very nice - nr_requests is not a well defined quantity,
since a single request can be anywhere from 512b to megabyte(s) in size.
So throttling on X number of requests tends to be pretty vague and
depends hugely on the workload (random vs sequential IO).
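
To put rough numbers on how vague that is, here is a small arithmetic
sketch. It assumes a limit of 128 requests and takes 512 bytes and 1 MiB as
the two extremes of request size; the 1 MiB figure is only an assumed
stand-in for "megabyte(s)", and the function name is made up for
illustration.

```python
def queued_bytes(nr_requests, request_size_bytes):
    """Bytes covered if every queue slot holds a request of the given size."""
    return nr_requests * request_size_bytes


NR_REQUESTS = 128           # the "pretty fair number" for the queue limit
SMALL_REQUEST = 512         # one sector, typical of random IO
LARGE_REQUEST = 1024 * 1024 # 1 MiB, assumed size of a merged sequential request

low = queued_bytes(NR_REQUESTS, SMALL_REQUEST)
high = queued_bytes(NR_REQUESTS, LARGE_REQUEST)

# The same request count spans a 2048x range in bytes.
print(low, high, high // low)
```

Under those assumptions a full queue holds anywhere from 64 KiB to 128 MiB,
so the same request count throttles very different amounts of data.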

--
Jens Axboe
