Re: testing io.low limit for blk-throttle

From: Paolo Valente
Date: Fri Apr 27 2018 - 01:14:31 EST

> On 27 Apr 2018, at 05:27, Joseph Qi <jiangqi903@xxxxxxxxx> wrote:
> Hi Paolo,
> On 18/4/27 01:27, Paolo Valente wrote:
>>> On 25 Apr 2018, at 14:13, Joseph Qi <jiangqi903@xxxxxxxxx> wrote:
>>> Hi Paolo,
>> Hi Joseph
>>> ...
>>> Could you run blktrace as well when testing your case? There are several
>>> throtl traces to help analyze whether it is caused by frequent
>>> upgrade/downgrade.
>> Certainly. You can find a trace attached. Unfortunately, I'm not
>> familiar with the internals of blk-throttle and low limit, so, if you
>> want me to analyze the trace, give me some hints on what I have to
>> look for. Otherwise, I'll be happy to learn from your analysis.
> I've taken a glance at the blktrace you attached. It only upgrades at first and
> then downgrades frequently (just adjusting the limit, not going to LIMIT_LOW).
> But I don't know why it always thinks the throttle group is not idle.
> For example:
> fio-2336 [004] d... 428.458249: 8,16 m N throtl avg_idle=90, idle_threshold=1000, bad_bio=10, total_bio=84, is_idle=0, scale=9
> fio-2336 [004] d... 428.458251: 8,16 m N throtl downgrade, scale 4
> In throtl_tg_is_idle():
> is_idle = ... ||
> (tg->latency_target && tg->bio_cnt &&
> tg->bad_bio_cnt * 5 < tg->bio_cnt);
> It should be idle and be allowed to use more bandwidth. But here the result
> shows not idle (is_idle=0). I have to investigate more to figure out why.

Hi Joseph,
actually this doesn't surprise me much: for this scenario I expected
blk-throttle to consider the random-I/O group, for most of the time,
1) non idle,
2) below low limit, and
3) above the 100usec target latency.

In fact,
1) The group can evidently issue I/O at a much higher rate than that
received, so, immediately after its last pending I/O has been served,
the group issues new I/O; in the end, it is non idle most of the time.
2) To try to enforce the 10MB/s limit, blk-throttle necessarily makes
the group oscillate around 10MB/s, which means that the group is
frequently below limit (this would fail to hold only if the group
actually received much more than 10MB/s, but it does not).
3) For each of the 4k random I/Os of the group, the time needed by the
drive to serve that I/O is already around 40-50usec. So, since the
group is of course not constantly in service, it can easily happen
that, because of throttling, the latency of most I/Os of the group
goes beyond 100usec.

But, as is often the case for me, I might have simply misunderstood
blk-throttle parameters, and I might just be wrong here.


> You can also filter these logs using:
> grep throtl trace | grep -E 'upgrade|downgrade|is_idle'
> Thanks,
> Joseph