Re: IO scheduler, queue depth, nr_requests

From: Nick Piggin
Date: Wed Feb 18 2004 - 21:05:42 EST




Miquel van Smoorenburg wrote:

On Thu, 19 Feb 2004 02:24:31, Nick Piggin wrote:

Miquel van Smoorenburg wrote:


I found out what causes this. It's get_request_wait().

When the request queue is full, and a new request needs to be created,
__make_request() blocks in get_request_wait().

Another process wakes up first (pdflush / process submitting I/O itself /
xfsdatad / etc) and sends the next bio's to __make_request().
In the mean time some free requests have become available, and the bios
are merged into a new request. Those requests are submitted to the device.

Then, get_request_wait() returns but the bio is not mergeable anymore -
and that results in a backwards seek, severely limiting the I/O rate.


The "batching" logic there should allow a process to submit
a number of requests even above the nr_requests limit to
prevent this interleave and context switching.

Are you using tagged command queueing? What depth?


No, I'm not using tagged command queueing. The 3ware controller is not a
real scsi controller, the driver just emulates one. It's a raid5 controller
that drives SATA disks. It has an internal request queue ("can_queu")
of 254 outstanding commands.


This is what I mean by tagged command queueing.

Because that is way bigger than nr_requests
this happens - if I set nr_requests to 512, the problem goes away. But
that shouldn't happen ;)



What shouldn't happen?

I'm preparing a proof-of-concept patch now, if it works and I don't wedge
the remote machine I'm testing this on I'll post it in a few minutes.



I'm not very happy with forcing a process to sleep _after_ it
has submitted a request... but I'd be interested to see exactly
what your patch does.

By far the best option is to use appropriately sized queues.
The below patch is a start, but it unfortunately doesn't help
drivers which use private queueing implementations.

http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.3/2.6.3-mm1/broken-out/scale-nr_requests.patch

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/