Re: [RFC 2/2] Track io_ticks at microsecond granularity.

From: Ming Lei
Date: Tue Jun 09 2020 - 04:50:25 EST


On Mon, Jun 08, 2020 at 09:07:24PM -0700, Josh Snyder wrote:
> Previously, we performed truncation of I/O issue/completion times during
> calculation of io_ticks, counting only I/Os which cross a jiffy
> boundary. The effect is a sampling of I/Os: at every boundary between
> jiffies we ask "is there an outstanding I/O" and increment a counter if
> the answer is yes. This produces results that are accurate (they don't
> systematically over- or under-count), but not precise (there is high
> variance associated with only taking 100 samples per second).
>
> This change modifies the sampling rate from 100Hz to 976562.5Hz (1
> sample per 1024 nanoseconds). I chose this sampling rate by simulating a
> workload in which I/Os are issued randomly (by a Poisson process), and
> processed in constant time: an M/D/â system (Kendall's notation). My
> goal was to produce a sampled utilization fraction which was correct to
> one part-per-thousand given one second of samples.
>
> The tradeoff of the higher sampling rate is increased synchronization
> overhead caused by more frequent compare-and-swap operations. The
> technique of commit 5b18b5a73760 ("block: delete part_round_stats and
> switch to less precise counting") is to allow multiple I/Os to complete
> while performing only one synchronized operation. As we are increasing
> the sample rate by a factor of 10000, we will less frequently be able to
> exercise the synchronization-free code path.

Not sure if we need so precise %util, and ~1M sampling rate may cause to run
cmpxchg() 1M/sec for each partition, which overhead might be observable.

Thanks,
Ming