Re: [PATCH v3] block: make iolatency avg_lat exponentially decay

From: Tejun Heo
Date: Thu Aug 02 2018 - 12:04:58 EST


Hello, Dennis.

On Wed, Aug 01, 2018 at 11:15:41PM -0700, Dennis Zhou wrote:
> From: "Dennis Zhou (Facebook)" <dennisszhou@xxxxxxxxx>
>
> Currently, avg_lat is calculated by accumulating the mean of every
> window in a long running cumulative average. As time goes on, the metric
> becomes less and less useful due to the accumulated history.
>
> This patch reuses the same calculation done in load averages to make the
> avg_lat metric more lively. Unlike load averages, the avg only advances
> when a window elapses (due to an io). Idle periods extend the most
> recent window. Bucketing is used to limit the history of avg_lat by
> binding it to the window size. So, the window range for 1/exp (decay
> rate) is [1 min, 2.5 min) when windows elapse immediately.
>
> The current sample window size is exposed in the debug info to enable
> calculation of the window range.
>
> Signed-off-by: Dennis Zhou <dennisszhou@xxxxxxxxx>
> Acked-by: Tejun Heo <tj@xxxxxxxxxx>
> Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> Acked-by: Josef Bacik <josef@xxxxxxxxxxxxxx>
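
For anyone following along, the load-average style decay described above
can be sketched roughly as below. This is a simplified userspace
illustration, not the code from the patch: the names decay_avg(),
feed_windows() and EXP_1MIN are made up for the example, the 5 second
window is an assumption, and the rounding step that the kernel load
average code performs is left out. The fixed-point constants (FSHIFT of
11, 1884 for 1/exp(5s/1min)) follow the ones used for load averages.

```c
#include <stdint.h>

/* Fixed-point constants in the style of the kernel load average code. */
#define FSHIFT   11                 /* bits of fractional precision */
#define FIXED_1  (1UL << FSHIFT)    /* 1.0 in fixed point */
#define EXP_1MIN 1884UL             /* 1/exp(5s/1min); assumes 5s windows */

/*
 * Blend one elapsed window's mean latency into the running average:
 *   avg = avg * exp + sample * (1 - exp)
 * with exp expressed in fixed point. Truncates instead of rounding.
 */
static uint64_t decay_avg(uint64_t avg, uint64_t exp, uint64_t sample)
{
	uint64_t next = avg * exp + sample * (FIXED_1 - exp);

	return next >> FSHIFT;
}

/* Hypothetical helper: feed n windows that all saw the same mean latency. */
static uint64_t feed_windows(uint64_t avg, uint64_t sample, unsigned int n)
{
	while (n--)
		avg = decay_avg(avg, EXP_1MIN, sample);
	return avg;
}
```

With these assumed constants, twelve back-to-back windows (one minute)
move the average roughly 63% (1 - 1/e) of the way from its old value to
the new sample, and the average only advances when a window elapses, so
an idle period does not decay it.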

Heh, sorry for not thinking of this from the beginning, but I think
it'd be great to move this to blkcg core so that the stat is always
available regardless of blk-iolatency. It's a really important metric
and can be used for both monitoring and policy implementation.

Thanks!

--
tejun