Re: IO scheduler based IO controller V10

From: Mike Galbraith
Date: Sat Sep 26 2009 - 10:51:36 EST


On Fri, 2009-09-25 at 16:26 -0400, Vivek Goyal wrote:
> On Fri, Sep 25, 2009 at 04:20:14AM +0200, Ulrich Lukas wrote:
> > Vivek Goyal wrote:
> > > Notes:
> > > - With vanilla CFQ, random writers can overwhelm a random reader,
> > > bringing down its throughput and bumping up latencies significantly.
> >
> >
> > IIRC, with vanilla CFQ, sequential writing can overwhelm random readers,
> > too.
> >
> > I'm basing this assumption on the observations I made on both OpenSuse
> > 11.1 and Ubuntu 9.10 alpha6 which I described in my posting on LKML
> > titled: "Poor desktop responsiveness with background I/O-operations" of
> > 2009-09-20.
> > (Message ID: 4AB59CBB.8090907@xxxxxxxxxxxxxxxxx)
> >
> >
> > Thus, I'm posting this to show that your work is greatly appreciated,
> > given the rather disappointing status quo of Linux's fairness when it
> > comes to disk IO time.
> >
> > I hope that your efforts lead to a change in performance of current
> > userland applications, the sooner, the better.
> >
> [Please don't remove people from original CC list. I am putting them back.]
>
> Hi Ulrich,
>
> I quickly went through that mail thread and tried the following on my
> desktop.
>
> ##########################################
> dd if=/home/vgoyal/4G-file of=/dev/null &
> sleep 5
> time firefox
> # close firefox once gui pops up.
> ##########################################
>
> It was taking close to 1 minute 30 seconds to launch firefox, and dd
> reported the following.
>
> 4294967296 bytes (4.3 GB) copied, 100.602 s, 42.7 MB/s
>
> (Results do vary across runs, especially if system is booted fresh. Don't
> know why...).
>
>
> Then I tried putting the two applications in separate groups and
> assigning each a weight of 200.
>
> ##########################################
> dd if=/home/vgoyal/4G-file of=/dev/null &
> echo $! > /cgroup/io/test1/tasks
> sleep 5
> echo $$ > /cgroup/io/test2/tasks
> time firefox
> # close firefox once gui pops up.
> ##########################################
>
> Now firefox pops up in 27 seconds. So it cut down the time by 2/3.
>
> 4294967296 bytes (4.3 GB) copied, 84.6138 s, 50.8 MB/s
>
> Notice that throughput of dd also improved.
>
> I ran the block trace and noticed that in many cases firefox threads
> immediately preempted the "dd". Probably because it was a file system
> request. So in this case latency will arise from seek time.
>
> In some other cases, threads had to wait for up to 100ms because dd was
> not preempted. In this case latency will arise both from waiting on queue
> as well as seek time.

Hm, with tip, I see ~10ms max wakeup latency running the scriptlet below.

> With the cgroup thing, we will run a 100ms slice for the group in which
> firefox is being launched and then give a 100ms uninterrupted time slice
> to dd. So it should cut down on the number of seeks happening and that's
> why we probably see this improvement.

I'm not testing with group IO/CPU, but my numbers kinda agree that it's
seek latency that's THE killer. What the numbers below, compiled from
the cheezy script below, _seem_ to be telling me is that the default
setting of CFQ quantum is allowing too many write requests through,
inflicting too much read latency... for the disk where my binaries live.
The longer the seeky burst, the more it hurts both reader and writer, so
cutting down the max requests queueable helps the reader (which I think
can't queue anywhere near as much per unit time as the writer can) finish
and get out of the writer's way sooner.
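
As a rough illustration of the tunable being discussed, here is a minimal
sketch that prints the current CFQ quantum for a disk. The sysfs path is the
one used by the script further down; show_quantum and SYSFS_ROOT are
hypothetical names made up for this sketch, and SYSFS_ROOT exists only so the
helper can be dry-run against a fake directory tree.

```shell
#!/bin/sh
# Sketch only: show_quantum and SYSFS_ROOT are illustrative names,
# not part of any kernel tooling.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}

show_quantum() {
	disk=$1
	f=$SYSFS_ROOT/block/$disk/queue/iosched/quantum
	if [ -r "$f" ]; then
		echo "$disk quantum=$(cat "$f")"
	else
		# The file is only present while CFQ is the active scheduler.
		echo "$disk: no quantum tunable (is CFQ active?)" >&2
		return 1
	fi
}
```

Something like "show_quantum sdb" would then report the value before a test
run changes it.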

'nuff possibly useless words, onward to possibly useless numbers :)

dd pre == number dd emits upon receiving USR1 before execing perf.
perf stat == time to load/execute perf stat konsole -e exit.
dd post == same dd number, taken after perf finishes.
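
The dd numbers can be pulled out of a log mechanically. A minimal sketch,
assuming the GNU dd status line format quoted earlier in the thread
("... copied, 100.602 s, 42.7 MB/s") and a rate reported in MB/s; dd_rates is
a hypothetical helper name, and lines reporting other units are skipped
rather than converted.

```shell
#!/bin/sh
# Sketch only: dd_rates is an illustrative name.
# Prints the MB/s figure from each dd status line found in the given
# files, or on stdin when no file is given.
dd_rates() {
	awk '/ copied, / && $NF == "MB/s" { print $(NF-1) }' "$@"
}
```

Run against one of the quantum_log_N files, the first rate of each loop would
be the "dd pre" value and the second the "dd post" value.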

quantum = 1                                         Avg
dd pre       58.4    52.5    56.1    61.6    52.3   56.1  MB/s
perf stat    2.87    0.91    1.64    1.41    0.90    1.5  Sec
dd post      56.6    61.0    66.3    64.7    60.9   61.9

quantum = 2
dd pre       59.7    62.4    58.9    65.3    60.3   61.3
perf stat    5.81    6.09    6.24   10.13    6.21    6.8
dd post      64.0    62.6    64.2    60.4    61.1   62.4

quantum = 3
dd pre       65.5    57.7    54.5    51.1    56.3   57.0
perf stat   14.01   13.71    8.35    5.35    8.57    9.9
dd post      59.2    49.1    58.8    62.3    62.1   58.3

quantum = 4
dd pre       57.2    52.1    56.8    55.2    61.6   56.5
perf stat   11.98    1.61    9.63   16.21   11.13   10.1
dd post      57.2    52.6    62.2    49.3    50.2   54.3
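
For anyone wanting to recompute the Avg column from the five runs, a minimal
sketch follows. avg is a hypothetical helper name, and two decimal places is
an arbitrary choice, so its output is more precise than the single decimal
shown in the table.

```shell
#!/bin/sh
# Sketch only: avg is an illustrative name.
# Prints the arithmetic mean of its arguments to two decimal places.
avg() {
	[ $# -gt 0 ] || return 1
	printf '%s\n' "$@" | awk '{ s += $1; n++ } END { printf "%.2f\n", s / n }'
}
```

For example, "avg 58.4 52.5 56.1 61.6 52.3" prints 56.18 for the first
dd pre row.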

Nothing pinned btw, 4 cores available, but only 1 drive.

#!/bin/sh

DISK=sdb
QUANTUM=/sys/block/$DISK/queue/iosched/quantum
# Sweep quantum from 1 up to whatever it is currently set to.
END=$(cat $QUANTUM)

for q in `seq 1 $END`; do
	echo $q > $QUANTUM
	LOGFILE=quantum_log_$q
	rm -f $LOGFILE
	for i in `seq 1 5`; do
		echo 2 > /proc/sys/vm/drop_caches
		# Start the background writer and let it run for a while.
		sh -c "dd if=/dev/zero of=./deleteme.dd 2>&1|tee -a $LOGFILE" &
		sleep 30
		sh -c "echo quantum $(cat $QUANTUM) loop $i" 2>&1|tee -a $LOGFILE
		# Warm-up run against a nonexistent process name, output discarded.
		perf stat -- killall -q get_stuf_into_ram >/dev/null 2>&1
		sleep 1
		# USR1 makes dd print its statistics so far: "dd pre".
		killall -q -USR1 dd &
		sleep 1
		sh -c "perf stat -- konsole -e exit" 2>&1|tee -a $LOGFILE
		sleep 1
		# Second USR1: "dd post".
		killall -q -USR1 dd &
		sleep 5
		killall -qw dd
		rm -f ./deleteme.dd
		sync
		sh -c "echo" 2>&1|tee -a $LOGFILE
	done;
done;

