Re: [PATCH] maximize dispatching in block throttle
From: Vivek Goyal
Date: Fri Dec 03 2010 - 09:33:06 EST
On Fri, Dec 03, 2010 at 10:26:50PM +0800, Hillf Danton wrote:
> On Wed, Dec 1, 2010 at 10:41 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> > On Wed, Dec 01, 2010 at 09:30:00PM +0800, Hillf Danton wrote:
> >> On Tue, Nov 30, 2010 at 10:57 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> >> > On Fri, Nov 26, 2010 at 10:46:01PM +0800, Hillf Danton wrote:
> >> >> When dispatching bio, the quantum is divided into read/write budgets,
> >> >> and dispatching for write could not exceed the write budget even if
> >> >> the read budget is not exhausted, either dispatching for read.
> >> >>
> >> >> It is changed to exhaust the quantum, if possible, in this work for
> >> >> dispatching bio.
> >> >>
> >> >> Though it is hard to understand that 50/50 division is not selected,
> >> >> the difference between divisions could impact little on dispatching as
> >> >> much as quantum allows then.
> >> >>
> >> >> Signed-off-by: Hillf Danton <dhillf@xxxxxxxxx>
> >> >> ---
> >> >
> >> > Hi Hillf,
> >> >
> >> > Even if there are not enough READS/WRITES to consume the quantum, I don't
> >> > think that it changes anyting much. The next dispatch round will be
> >>
> >> Why not exhaust quantum this round directly if not throttled?
> >>
> >
> > Yes we can do that. I am wondering what do we gain by putting extra code
> > in?
> >
> >> > scheduled almost immediately (If there are bios which are ready to
> >> > be dispatched). Look at throtl_schedule_next_dispatch().
> >> >
> >> > Have you noticed some issues/improvements with this patch?
> >>
> >> Originally I wanted to get 75/25 division parameterized through sysfs or proc,
> >> but I changed mind since exhausting quantum is simpler and much applicable.
> >
> > But these are two different things isn't it?
> >
> > We can introduce some sysfs tunables (if need be) so that a user can decide
> > the division of READS/WRITES.
> >
> >>
> >> As you see, the application environments change from one user to another,
> >> even though are more latency sensitive.
> >>
> >> It looks nicer, I think, to provide both the default division and the
> >> methods to
> >> change for users to play their games.
> >
> > Yep, tunables can help here and introducing tunables is easy. At the same
>
> If it is too hard, please change to what is more valuable.
> Hillf
It should not be too hard. IO schedulers already create
/sys/block/<dev>/queue/iosched/ dir and we can create <dev>/queue/throttle/
dir and export throttle related tunables there.
What I am not clear about is that what problem are you running into.
Tunbales and your patch to fill the quantum are two different things. So
how can both the things solve your problem. One of these should.
So if you can explain the issue you are facing with some more details, it
will help. I am not sure how your patch of filling the quantum with
requests of other direction is helping you?
Vivek
>
> > time I prefer to intoroduce tunables only on need basis otherwise it
> > runs the risk of too many tunables which are not being used.
> >
> > So, conceptually this patch looks fine to me. Jeff Moyer also raised the
> > same question of why not fill up the quantum if READ/WRITE are not using
> > their quota. So I think it does not hurt to take this patch in.
> >
> > Acked-by: Vivek Goyal <vgoyal@xxxxxxxxxx>
> >
> > Thanks
> > Vivek
> >
> >>
> >> Thanks
> >> Hillf
> >>
> >> >
> >> > Generally READS are more latency sensitive as compared to WRITE hence
> >> > I thought of dispatching more READS per quantum.
> >> >
> >> > Thanks
> >> > Vivek
> >> >
> >> >>
> >> >> --- a/block/blk-throttle.c 2010-11-01 19:54:12.000000000 +0800
> >> >> +++ b/block/blk-throttle.c 2010-11-26 21:49:00.000000000 +0800
> >> >> @@ -647,11 +647,16 @@ static int throtl_dispatch_tg(struct thr
> >> >> unsigned int max_nr_reads = throtl_grp_quantum*3/4;
> >> >> unsigned int max_nr_writes = throtl_grp_quantum - nr_reads;
> >> >> struct bio *bio;
> >> >> + int read_throttled = 0, write_throttled = 0;
> >> >>
> >> >> /* Try to dispatch 75% READS and 25% WRITES */
> >> >> -
> >> >> + try_read:
> >> >> while ((bio = bio_list_peek(&tg->bio_lists[READ]))
> >> >> - && tg_may_dispatch(td, tg, bio, NULL)) {
> >> >> + && ! read_throttled) {
> >> >> + if (! tg_may_dispatch(td, tg, bio, NULL)) {
> >> >> + read_throttled = 1;
> >> >> + break;
> >> >> + }
> >> >>
> >> >> tg_dispatch_one_bio(td, tg, bio_data_dir(bio), bl);
> >> >> nr_reads++;
> >> >> @@ -659,9 +664,15 @@ static int throtl_dispatch_tg(struct thr
> >> >> if (nr_reads >= max_nr_reads)
> >> >> break;
> >> >> }
> >> >> -
> >> >> + if (! bio)
> >> >> + read_throttled = 1;
> >> >> + try_write:
> >> >> while ((bio = bio_list_peek(&tg->bio_lists[WRITE]))
> >> >> - && tg_may_dispatch(td, tg, bio, NULL)) {
> >> >> + && ! write_throttled) {
> >> >> + if (! tg_may_dispatch(td, tg, bio, NULL)) {
> >> >> + write_throttled = 1;
> >> >> + break;
> >> >> + }
> >> >>
> >> >> tg_dispatch_one_bio(td, tg, bio_data_dir(bio), bl);
> >> >> nr_writes++;
> >> >> @@ -669,7 +680,23 @@ static int throtl_dispatch_tg(struct thr
> >> >> if (nr_writes >= max_nr_writes)
> >> >> break;
> >> >> }
> >> >> + if (! bio)
> >> >> + write_throttled = 1;
> >> >> +
> >> >> + if (write_throttled && read_throttled)
> >> >> + goto out;
> >> >>
> >> >> + if (! (throtl_grp_quantum > nr_writes + nr_reads))
> >> >> + goto out;
> >> >> +
> >> >> + if (read_throttled) {
> >> >> + max_nr_writes = throtl_grp_quantum - nr_reads;
> >> >> + goto try_write;
> >> >> + } else {
> >> >> + max_nr_reads = throtl_grp_quantum - nr_writes;
> >> >> + goto try_read;
> >> >> + }
> >> >> + out:
> >> >> return nr_reads + nr_writes;
> >> >> }
> >> >> --
> >> >> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> >> >> the body of a message to majordomo@xxxxxxxxxxxxxxx
> >> >> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >> >> Please read the FAQ at http://www.tux.org/lkml/
> >> >
> >
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/