Re: Bandwidth Allocations under CFQ I/O Scheduler

From: Ric Wheeler
Date: Wed Oct 18 2006 - 07:24:52 EST



Helge Hafting wrote:

Jens Axboe wrote:

While that may make some sense internally, the exported interface would
never be workable like that. It needs to be simple, "give me foo kb/sec
with max latency bar for this file", with an access pattern or assumed
sequential io.

Nobody speaks of iops/sec except some silly benchmark programs. I know
that you are describing pseudo-iops, but that still doesn't make it any
clearer. Things aren't that simple.
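
A minimal sketch of what such a per-file reservation interface might
look like from userspace. This is illustration only: struct
io_reservation, FIOC_SET_RESERVATION, and the filename are all invented
here and exist in no kernel.

/* Hypothetical per-file bandwidth reservation, illustration only. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>

struct io_reservation {
	unsigned int kb_per_sec;	/* "give me foo kb/sec"     */
	unsigned int max_latency_ms;	/* "with max latency bar"   */
	unsigned int sequential;	/* access pattern hint      */
};

/* Invented ioctl number, not a real kernel interface. */
#define FIOC_SET_RESERVATION _IOW('f', 0x42, struct io_reservation)

int main(void)
{
	struct io_reservation res = {
		.kb_per_sec	= 4096,	/* 4 MB/s */
		.max_latency_ms	= 100,
		.sequential	= 1,
	};
	int fd = open("datafile", O_RDONLY);

	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (ioctl(fd, FIOC_SET_RESERVATION, &res) < 0)
		perror("ioctl");	/* expected: ioctl is hypothetical */
	close(fd);
	return 0;
}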

How about "give me 10% of total io capacity?" People understand
this, and the io scheduler can then guarantee this by ensuring
that the process gets 1 out of 10 io requests as long as it
keeps submitting enough.

The admin can then set a reasonable percentage depending on
the machine's capacity.

Helge Hafting
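
A toy user-space model of Helge's proportional-share idea; the queue
names, shares, and credit scheme below are invented for the sketch and
are not CFQ's actual bookkeeping. Each queue earns credit in proportion
to its share every dispatch round, so a queue holding 10% gets roughly
1 out of every 10 dispatches as long as it keeps submitting:

/* Toy proportional-share dispatcher: each queue owns a percentage of
 * total io capacity; credit accrues per round and the queue with the
 * most credit (and pending work) dispatches next. */
#include <stdio.h>

#define NQUEUES 3

struct queue {
	const char *name;
	int share;	/* percent of total io capacity */
	int credit;	/* accumulated right-to-dispatch */
	int pending;	/* requests waiting */
};

int main(void)
{
	struct queue q[NQUEUES] = {
		{ "bulk-read", 80, 0, 100 },
		{ "writer",    10, 0, 100 },
		{ "reader",    10, 0, 100 },
	};

	for (int round = 0; round < 20; round++) {
		int best = -1;

		for (int i = 0; i < NQUEUES; i++) {
			q[i].credit += q[i].share; /* earn share each round */
			if (q[i].pending &&
			    (best < 0 || q[i].credit > q[best].credit))
				best = i;
		}
		if (best < 0)
			continue;		/* nothing to do */
		q[best].credit -= 100;		/* pay for one dispatch */
		q[best].pending--;
		printf("round %2d: dispatch from %s\n", round, q[best].name);
	}
	return 0;
}

Over any longish run this hands bulk-read about 8 of every 10
dispatches and each 10% queue about 1, which is the guarantee Helge
describes.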

The tricky part is that when you mix up workloads, you destroy the drive's ability to minimize head seek and rotational latency. For example, I have measured almost a 10x drop in throughput when I mix one serious workload (reading every file in a large file system as fast as you can) with a moderate write workload.

All of which is a long-winded way of saying that what we might be able to do in the worst case is hand out an even portion of that worst-case IO capability, which is itself only 10% of the best case (i.e., 1% of the non-shared best case per process) ;-)
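
To put invented numbers on that arithmetic:

best case (single sequential reader):  80 MB/s   (illustrative figure)
mixed workload, 10x drop:               8 MB/s   aggregate
even ten-way split of the aggregate:    0.8 MB/s per process = 1% of best case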

Some of the high-end arrays (like the EMC Symmetrix, IBM Shark, Hitachi boxes, etc.) are much better at this kind of sharing, since they have massive amounts of non-volatile DRAM and plenty of algorithmic ability to tease apart individual streams internally. Note that they have to do this, since they are connected to many different hosts.

It might be interesting to think about how we would tweak things for this specific class of arrays as a special case.

ric
