Re: [RFC] Workload type Vs Groups (Was: Re: [PATCH 02/20] blkio: Change CFQ to use CFS like queue time stamps)

From: Corrado Zoccolo
Date: Thu Nov 12 2009 - 03:53:33 EST


On Tue, Nov 10, 2009 at 8:15 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> On Tue, Nov 10, 2009 at 07:05:19PM +0100, Corrado Zoccolo wrote:
>> On Tue, Nov 10, 2009 at 3:12 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
>> >
>> > Ok, I ran some simple tests on my NCQ SSD. I had pulled Jens' branch a
>> > few days back and it has your patches in it.
>> >
>> > I am running three direct sequential readers of prio 0, 4 and 7
>> > respectively using fio for 10 seconds and then monitoring how much work
>> > each got done.
>> >
>> > Following is my fio job file
>> >
>> > ****************************************************************
>> > [global]
>> > ioengine=sync
>> > runtime=10
>> > size=1G
>> > rw=read
>> > directory=/mnt/sdc/fio/
>> > direct=1
>> > bs=4K
>> > exec_prerun="echo 3 > /proc/sys/vm/drop_caches"
>> >
>> > [seqread0]
>> > prio=0
>> >
>> > [seqread4]
>> > prio=4
>> >
>> > [seqread7]
>> > prio=7
>> > ************************************************************************
>>
>> Can you try without direct and bs?
>>
>
> Ok, here are the results without direct and bs. So these are now buffered
> reads. The fio file above remains more or less the same, except that I had
> to change size to 2G, as within 10 seconds a process can finish reading
> 1G and drop out of contention.
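
(If I understand the change correctly, the job file you ran would now read
roughly as below; this is just my reconstruction, assuming everything other
than size=, direct= and bs= stayed as in the file quoted above.)

****************************************************************
[global]
ioengine=sync
runtime=10
size=2G
rw=read
directory=/mnt/sdc/fio/
exec_prerun="echo 3 > /proc/sys/vm/drop_caches"

[seqread0]
prio=0

[seqread4]
prio=4

[seqread7]
prio=7
************************************************************************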
>
> First Run
> =========
> read : io=382MB, bw=39,112KB/s, iops=9,777, runt= 10001msec
> read : io=939MB, bw=96,194KB/s, iops=24,048, runt= 10001msec
> read : io=765MB, bw=78,355KB/s, iops=19,588, runt= 10004msec
>
> Second run
> ==========
> read : io=443MB, bw=45,395KB/s, iops=11,348, runt= 10004msec
> read : io=1,058MB, bw=106MB/s, iops=27,081, runt= 10001msec
> read : io=650MB, bw=66,535KB/s, iops=16,633, runt= 10006msec
>
> Third Run
> =========
> read : io=727MB, bw=74,465KB/s, iops=18,616, runt= 10004msec
> read : io=890MB, bw=91,126KB/s, iops=22,781, runt= 10001msec
> read : io=406MB, bw=41,608KB/s, iops=10,401, runt= 10004msec
>
> Fourth Run
> ==========
> read : io=792MB, bw=81,143KB/s, iops=20,285, runt= 10001msec
> read : io=1,024MB, bw=102MB/s, iops=26,192, runt= 10009msec
> read : io=314MB, bw=32,093KB/s, iops=8,023, runt= 10011msec
>
> I still can't get service differentiation proportionate to the priority
> levels. In fact, in some cases it looks more like priority inversion, where
> the higher-priority reader gets lower BW.
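
Just to quantify what "proportionate" would mean here: with CFQ's usual
ioprio-to-slice scaling (assuming the default 100ms sync base slice, be0,
be4 and be7 get roughly 180ms, 100ms and 40ms slices, i.e. weights
180:100:40), your fourth run's ~2130MB total would ideally split as about
2130*180/320 = 1198MB, 2130*100/320 = 666MB and 2130*40/320 = 266MB,
quite different from the 792/1024/314MB you measured.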

Jeff's numbers are:
be0-through-7.fio:
total priority: 880
total data transferred: 4064576
class  prio    ideal  xferred  %diff
be        0   831390   645764    -23
be        1   739013   562932    -24
be        2   646637  2097156    224
be        3   554260   250612    -55
be        4   461883   185332    -60
be        5   369506   149492    -60
be        6   277130    98036    -65
be        7   184753    75252    -60

be0-vs-be1.fio:
total priority: 340
total data transferred: 2244584
class  prio    ideal  xferred  %diff
be        0  1188309  1179636     -1
be        1  1056274  1064948      0

be0-vs-be7.fio:
total priority: 220
total data transferred: 2232808
class  prio    ideal  xferred  %diff
be        0  1826842  1834484      0
be        7   405965   398324     -2

There is one big outlier, but usually the transferred data is in line
with priority.
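
For reference, this is how I read the "ideal" column in those tables: each
reader's ideal share is total_transferred * weight / total_weight, where the
weight is CFQ's prio-to-slice scaling (slice = base + base/5 * (4 - prio),
i.e. 180ms down to 40ms assuming the default 100ms sync base slice, which
also matches the "total priority: 880/340/220" lines). A quick user-space
sketch, not kernel code:

#include <stdio.h>

int main(void)
{
	/* Total data from the be0-through-7 run above. */
	const long total_xferred = 4064576;
	int prio, weight, total_weight = 0;

	/* CFQ slice length in ms per prio: 100 + 100/5 * (4 - prio). */
	for (prio = 0; prio < 8; prio++)
		total_weight += 100 + 20 * (4 - prio);	/* 880 */

	for (prio = 0; prio < 8; prio++) {
		weight = 100 + 20 * (4 - prio);
		printf("be %d ideal %ld\n", prio,
		       total_xferred * weight / total_weight);
	}
	return 0;
}

Running this reproduces the 831390 ... 184753 ideal values above.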
Seeing your numbers, though, where the process with intermediate priority
almost consistently gets more bandwidth than the others, I think a bug in
the code must be causing both your results and the outlier seen in Jeff's
test. I'll take a closer look at how the various parts of the code
interact, to see if I can spot the problem.

Thanks
Corrado