Re: [PATCH 00/14][V5] Introduce io.latency io controller for cgroups

From: Andrew Morton
Date: Mon Jul 02 2018 - 17:26:46 EST


On Fri, 29 Jun 2018 15:25:28 -0400 Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:

> This series adds a latency based io controller for cgroups. It is based on the
> same concept as the writeback throttling code, which watches the overall
> total latency of IOs in a given window and then adjusts the queue depth of
> the group accordingly. This is meant to be a workload protection controller, so
> whoever has the lowest latency target gets the preferential treatment with no
> thought to fairness or proportionality. It is meant to be work conserving, so
> as long as nobody is missing their latency targets the disk is fair game.
>
> We have been testing this in production for several months now to get the
> behavior right and we are finally at the point that it is working well in all of
> our test cases. With this patch we protect our main workload (the web server)
> and isolate out the system services (chef/yum/etc). This works well in the
> normal case, smoothing out weird requests per second (RPS) dips that we would see
> when one of the system services would run and compete for IO resources. This
> also works incredibly well in the runaway task case.
>
> The runaway task use case is where we have some task that slowly eats up all of
> the memory on the system (think a memory leak). Previously this sort of
> workload would push the box into a swapping/oom death spiral that was only
> recovered by rebooting the box. With this patchset and proper configuration of
> the memory.low and io.latency controllers we're able to survive this test with
> at most a 20% dip in RPS.

Is this purely useful for spinning disks, or is there some
applicability to SSDs and perhaps other storage devices? Some
discussion on this topic would be useful.

Patches 5, 7 & 14 look fine to me - go wild. #14 could do with a
couple of why-we're-doing-this comments, but I say that about
everything ;)
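
For readers skimming the thread, the mechanism in the quoted cover letter
(watch latency over a window, then adjust per-group queue depth, favouring
whoever has the lowest target, and leave everyone alone while all targets are
met) can be sketched roughly as below. This is a toy model, not the code in
the patches: the Group class, the adjust() helper, the MAX_DEPTH/MIN_DEPTH
limits, and the halve/double policy are all assumptions made up for
illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

MAX_DEPTH = 64  # hypothetical upper bound on a group's queue depth
MIN_DEPTH = 1   # hypothetical lower bound


@dataclass
class Group:
    name: str
    target_us: Optional[int]  # latency target in microseconds; None = no target
    depth: int = MAX_DEPTH    # current queue depth allowed for this group


def mean(samples: List[float]) -> float:
    """Average latency over one window; 0.0 if the group issued no IO."""
    return sum(samples) / len(samples) if samples else 0.0


def adjust(groups: List[Group], window: Dict[str, List[float]]) -> None:
    """Apply one window's worth of latency samples to the groups' depths."""
    # Groups that have a target and exceeded it during this window.
    missing = [
        g for g in groups
        if g.target_us is not None and mean(window.get(g.name, [])) > g.target_us
    ]

    if not missing:
        # Work conserving: nobody is missing a target, so relax throttling.
        for g in groups:
            g.depth = min(MAX_DEPTH, g.depth * 2)
        return

    # Protect the strictest (lowest) target among the groups that missed,
    # with no attempt at fairness: everyone with a looser target, or no
    # target at all, gets a shallower queue.
    strictest = min(g.target_us for g in missing)
    for g in groups:
        if g.target_us is None or g.target_us > strictest:
            g.depth = max(MIN_DEPTH, g.depth // 2)


web = Group("web", target_us=10_000)
system = Group("system", target_us=None)

# Window 1: the web group misses its target, so the system group is throttled.
adjust([web, system], {"web": [20_000, 30_000], "system": [5_000]})
depths_after_miss = (web.depth, system.depth)

# Window 2: the target is met again, so throttling is relaxed.
adjust([web, system], {"web": [5_000], "system": [5_000]})
depths_after_recovery = (web.depth, system.depth)
```

In this toy model the protected group's depth is never reduced on its own
behalf; only groups with looser or absent targets pay for a miss, which is the
"no thought to fairness or proportionality" property described in the cover
letter.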