Re: [PATCH] cpufreq: intel_pstate: Optimize IO boost in non HWP mode
From: Francisco Jerez
Date: Thu Sep 06 2018 - 00:39:23 EST
Srinivas Pandruvada <srinivas.pandruvada@xxxxxxxxxxxxxxx> writes:
> [...]
>
>> > >
>> > > This patch causes a number of statistically significant
>> > > regressions
>> > > (with significance of 1%) on the two systems I've tested it
>> > > on. On
>> > > my
>> >
>> > Sure. These patches are targeted to Atom clients where some of
>> > these
>> > server like workload may have some minor regression on few watts
>> > TDP
>> > parts.
>>
>> Neither the 36% regression of fs-mark, the 21% regression of sqlite,
>> nor
>> the 10% regression of warsaw qualify as small. And most of the test
>> cases on the list of regressions aren't exclusively server-like, if
>> at
>> all. Warsaw, gtkperf, jxrendermark and lightsmark are all graphics
>> benchmarks -- Latency is as important if not more for interactive
>> workloads than it is for server workloads. In the case of a conflict
>> like the one we're dealing with right now between optimizing for
>> throughput (e.g. for the maximum number of requests per second) and
>> optimizing for latency (e.g. for the minimum request duration), you
>> are
>> more likely to be concerned about the former than about the latter in
>> a
>> server setup.
>
> Eero,
> Please add your test results here.
>
> No matter which algorithm you do, there will be variations. So you have
> to look at the platforms which you are targeting. For this platform
> number one item is use of less turbo and hope you know why?
Unfortunately the current controller uses turbo frequently on Atoms for
TDP-limited graphics workloads regardless of IOWAIT boosting. IOWAIT
boosting simply exacerbated the pre-existing energy efficiency problem.
> On this platform GFX patch caused this issue as it was submitted after
> io boost patchset. So rather that should be reverted first before
> optimizing anything.
>
>
>
>>
>> > But weighing against reduced TURBO usage (which is enough trigger)
>> > and
>> > improvement in tests done by Eero (which was primary complaint to
>> > us).
>> >
>> > It is trivial patch. Yes, the patch is not perfect and doesn't
>> > close
>> > doors for any improvements.
>> >
>>
>> It's sort of self-contained because it's awfully incomplete.Don't
>> agtr
>
>>
>> > I see your idea, but how to implement in acceptable way is a
>> > challenge.
>>
>> Main challenge was getting the code to work without regressions in
>> latency-sensitive workloads, which I did, because you told me that it
>> wasn't acceptable for it to cause any regressions on the Phoronix
>> daily-system-tracker, whether latency-bound or not.
> Yes, because your intention was to have a global change not a low power
> Atom specific,
>
My intention was to target low-power Atoms only since the first day we
discussed this problem. The cover letter of the series I sent and the
commit messages make this fairly clear.
>> What is left in
>> order to address Peter's concerns is for the most part plumbing,
>> that's
>> guaranteed not to have any functional impact on the heuristic. The
>> fact
>> that we don't expect it to change the performance of the system makes
>> it
>> substantially less time-consuming to validate than altering the
>> performance trade-offs of the heuristic dynamically in order to avoid
>> regressions (which is what has kept my systems busy most of the time
>> lately). Seems like my series, even in its current state without the
>> changes requested by Peter is closer to being ready for production
>> than
>> this patch is.
>
> Sorry, Not at all. We call such patches as experimental series.
The numbers speak otherwise.
> You caused 100% regression to idle power. There is no version 2 after
> that, even if you fixed locally even to look.
>
I did post a link to a v2 that fixed the idle power issue on the v1
thread, but I didn't intend to send it for review yet. I'll send it out
once I've fully taken into account Peter's feedback.
> Thanks,
> Srinivas
Attachment:
signature.asc
Description: PGP signature