Re: [RFC]cfq-iosched: no dispatch limit for single queue

From: Jens Axboe
Date: Sat Dec 05 2009 - 03:51:11 EST


On Fri, Dec 04 2009, Corrado Zoccolo wrote:
> Hi Shaohua, Jens,
> On Thu, Dec 3, 2009 at 12:57 PM, Jens Axboe <jens.axboe@xxxxxxxxxx> wrote:
> > On Thu, Dec 03 2009, Shaohua Li wrote:
> >> Since commit 2f5cb7381b737e24c8046fd4aeab571fb71315f5, each queue can send
> >> up to 4 * 4 requests if only one queue exists. I wonder why we have such a limit.
> >> Devices that support tagged queuing can send more requests. For example, AHCI
> >> can send 31 requests. A test (direct aio randread) shows the limit reduces disk
> >> throughput by about 4%.
> >> On the other hand, since we send one request at a time, if another queue
> >> pops up while the current one is sending more than cfq_quantum requests, the
> >> current queue will stop sending requests soon after one request, so it sounds
> >> like there is no big latency hit.
> >>
> >> Signed-off-by: Shaohua Li <shaohua.li@xxxxxxxxx>
> >>
> >> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> >> index aa1e953..e05650f 100644
> >> --- a/block/cfq-iosched.c
> >> +++ b/block/cfq-iosched.c
> >> @@ -1298,9 +1298,9 @@ static bool cfq_may_dispatch(struct cfq_data *cfqd, struct cfq_queue *cfqq)
> >>                       return false;
> >>
> >>               /*
> >> -              * Sole queue user, allow bigger slice
> >> +              * Sole queue user, no limit
> >>                */
> >> -             max_dispatch *= 4;
> >> +             max_dispatch = -1;
> >>       }
> >>
> >>       /*
> >
> > As you mention, we do dispatches in bites of 1. In reality, there's
> > going to be little difference when we get this far in the depth process,
> > so I think the patch looks good. I have applied it, thanks.
>
> I think the limit should be removed only for sync queues.
> For async queues, if cfq_latency is not set, removing the limit here can
> cause very high latencies to sync queues (almost 100% increase),
> without a noticeable throughput gain.

It's always problematic to say 'without a noticeable throughput gain', as
on some workloads/storage, the difference between 16 and e.g. 32 in depth
WILL be noticeable. 16 is already high enough that if we hit that limit,
it will cause a latency hit. The hope here is that a larger depth won't
make it much worse, but we'll see.
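
To make the numbers in this thread concrete, here is a minimal sketch of the
three policies discussed: the old 4x cap for a sole queue, the patch's "no
limit", and the sync-only variant suggested above. It is a simplified model,
not the kernel code: struct model_queue, enum sole_policy and both helper
functions are hypothetical names, and INT_MAX stands in for the patch's
max_dispatch = -1.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a cfq queue; not the kernel struct. */
struct model_queue {
	bool sync;       /* true for a sync queue, false for async */
	int dispatched;  /* requests currently in flight for this queue */
};

/* How to treat a queue that is the only busy one (names are assumptions). */
enum sole_policy {
	SOLE_X4,                  /* before the patch: max_dispatch *= 4 */
	SOLE_UNLIMITED,           /* the patch: no limit for a sole queue */
	SOLE_UNLIMITED_SYNC_ONLY  /* suggestion: lift the limit for sync only */
};

/* Return the dispatch cap that applies to q under the given policy. */
static int model_max_dispatch(const struct model_queue *q, int quantum,
			      int busy_queues, enum sole_policy policy)
{
	/* With competing queues, the base quantum applies. */
	if (busy_queues > 1)
		return quantum;

	/* Sole queue: the cap depends on the policy being modelled. */
	if (policy == SOLE_UNLIMITED)
		return INT_MAX;
	if (policy == SOLE_UNLIMITED_SYNC_ONLY && q->sync)
		return INT_MAX;
	return quantum * 4;
}

/* True if q may send one more request to the device. */
static bool model_may_dispatch(const struct model_queue *q, int quantum,
			       int busy_queues, enum sole_policy policy)
{
	return q->dispatched < model_max_dispatch(q, quantum, busy_queues,
						  policy);
}
```

With a quantum of 4, a sole queue that already has 16 requests in flight is
stopped under SOLE_X4 but may keep dispatching under SOLE_UNLIMITED, which is
what lets a device with a queue depth of 31 be filled. Under
SOLE_UNLIMITED_SYNC_ONLY an async queue in the same state stays capped at 16.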

--
Jens Axboe
