Re: [PATCH 1/1] intel_pstate: Increase hold-off time before samples are scaled v2

From: Srinivas Pandruvada
Date: Tue Feb 23 2016 - 16:52:11 EST


On Tue, 2016-02-23 at 14:29 +0000, Mel Gorman wrote:
> Added a suggested change from Doug Smythies and can add a Signed-off-by
> if Doug is ok with that.
>
> Changelog since v1
> o Remove divide that is likely unnecessary (dsmythies)
> o Rebase on top of linux-pm/linux-next
>
> The PID relies on samples of equal time but this does not apply for
> deferrable timers when the CPU is idle. intel_pstate checks if the actual
> duration between samples is large and if so, the "busyness" of the CPU
> is scaled.
>
> This assumes the delay was a deferred timer but a workload may simply have
> been idle for a short time if it's context switching between a server and
> client or waiting very briefly on IO. It's compounded by the problem that
> server/clients migrate between CPUs due to wake-affine trying to maximise
> hot cache usage. In such cases, the cores are not considered busy and the
> frequency is dropped prematurely.
>
> This patch increases the hold-off value before the busyness is scaled. It
> was selected based simply on testing until the desired result was found.
> Tests were conducted with workloads that are either client/server based
> or short-lived IO.
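
For readers following along, the hold-off check described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the driver's code: the function name, the parameter names, and the plain integer arithmetic are hypothetical (the driver uses its own fixed-point helpers), and the multiplier values in the note below are illustrative only.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from intel_pstate: scale a busyness
 * value down when the gap between two samples exceeds holdoff_mult
 * nominal sample periods.
 */
static int64_t scale_busy_after_holdoff(int64_t core_busy,
                                        int64_t sample_time_us,
                                        int64_t duration_us,
                                        int64_t holdoff_mult)
{
    if (duration_us > sample_time_us * holdoff_mult) {
        /*
         * Long gap: assume the timer was deferred while the CPU was
         * idle and shrink busyness by sample_time / duration.
         */
        return core_busy * sample_time_us / duration_us;
    }

    /* Short gap: treat it as a brief idle and leave busyness alone. */
    return core_busy;
}
```

With a 10000 us sample period and a 50000 us gap, a small multiplier such as 3 scales a busyness of 100 down to 20, while a larger multiplier such as 12 leaves it at 100. That is the effect of raising the hold-off: short idle periods no longer pull the requested frequency down.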

Attached is a specpower comparison for a Haswell EP Grantley server.

This workload ran for a little over an hour.

Difference in OPS: +1019
Difference in power: +308.6
Difference in perf/watt: -312.479023

So we are consuming 308 more watts on average to perform 1019 more operations.
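
To put the two deltas above on one scale, the marginal efficiency can be worked out as extra operations per extra watt. The helper below is a hypothetical illustration, not part of the specpower tooling, and it assumes both deltas are averages over the same run.

```c
/*
 * Hypothetical helper: extra operations gained per extra watt consumed.
 * Assumes delta_watts is non-zero.
 */
static double marginal_ops_per_watt(double delta_ops, double delta_watts)
{
    return delta_ops / delta_watts;
}
```

Calling marginal_ops_per_watt(1019.0, 308.6) gives roughly 3.3 additional operations for each additional watt.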

Thanks,
Srinivas


Attachment: HSW Grantley (hswep) spec_power (02_23_16).pdf
Description: Adobe PDF document