Re: [RFC/RFT][PATCH] cpufreq: intel_pstate: Accept passive mode with HWP enabled

From: Srinivas Pandruvada
Date: Sun May 31 2020 - 14:59:37 EST


On Sun, 2020-05-31 at 11:06 -0700, Doug Smythies wrote:
> Hi Srinivas,
>
> Thank you for your quick reply.
>
> On 2020.05.31 09:54 Srinivas Pandruvada wrote
> > On Sun, 2020-05-31 at 09:39 -0700, Doug Smythies wrote:
> > > Event begins at 17.456 seconds elapsed time.
> > > Previous event was about 107 milliseconds ago.
> > >
> > > Old min ; new min ; freq GHz; load % ; duration ms
> > > 27 ; 28 ; 4.60 ; 68.17 ; 10.226
> > > 28 ; 26 ; 4.53 ; 57.47 ; 10.005
> >
> > Seems you hit power/thermal limit
>
> No.
>
> I am nowhere near any power limit at all.
> I have meticulously configured and tested the thermal management of
> this computer.
> I never ever hit a thermal limit and have TDP set such that the
> processor temperature never exceeds about 75 degrees centigrade.
>
> There should never be throttling involved in these experiments.
> I can achieve throttling when compiling the kernel and with
> torture test mode on the mprime test (other CPU stressors,
> including my own, are not as good at generating heat as
> mprime).
>
> This system can run indefinitely at 99.9 watts processor package power.
> Example (turbostat, steady state, CPU freq throttled to 4.04 GHz):
>
> doug@s18:~$ sudo ~/turbostat --Summary --quiet --show Busy%,Bzy_MHz,PkgTmp,PkgWatt,GFXWatt,IRQ --interval 12
> Busy% Bzy_MHz IRQ PkgTmp PkgWatt GFXWatt
> 100.21 4045 72231 66 99.93 0.00
> 100.21 4043 72239 65 99.92 0.00
>
> > Is this some Lenovo system?
>
> No. The web page version of my original e-mail has
> a link to the test computer hardware profile.
>
> The motherboard is ASUS PRIME Z390-P.
>

OK, this seems to be a desktop system.

> > If you disable HWP you don't see that?
>
> Correct.
>
> > What is the value of
> > cat /sys/bus/pci/devices/0000\:00\:04.0/tcc_offset_degree_celsius
>
> ? "No such file or directory"
>

> > cat /sys/class/powercap/intel-rapl-mmio/intel-rapl-mmio:0/constraint_0_power_limit_uw
>
You may not have CONFIG_INT340X_THERMAL=y in your kernel config.

What is the output of
# rdmsr 0x1a2
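
In case it helps when reading the result, here is a minimal sketch of
decoding the value. It assumes the usual layout of MSR 0x1a2
(MSR_TEMPERATURE_TARGET): the TCC activation temperature in bits 23:16
and the TCC offset in the bits starting at bit 24. The offset field
width differs between processor models, so offset_bits=6 is an
assumption, as are the function name and the sample value (it is not
output from your system).

```python
def decode_temperature_target(msr_value, offset_bits=6):
    """Split a raw MSR 0x1a2 value into (tcc_activation_c, tcc_offset_c).

    Assumed layout: bits 23:16 hold the TCC activation temperature,
    and offset_bits bits starting at bit 24 hold the TCC offset.
    The offset width is model specific; 6 is an assumption here.
    """
    tcc_activation = (msr_value >> 16) & 0xFF
    tcc_offset = (msr_value >> 24) & ((1 << offset_bits) - 1)
    return tcc_activation, tcc_offset


# Hypothetical rdmsr output, chosen only to illustrate the decoding.
activation, offset = decode_temperature_target(0x0A640000)
# Throttling would start at roughly activation - offset degrees C.
effective_limit = activation - offset
```

A non-zero offset would lower the temperature at which throttling
starts, which is why the value is worth checking.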

Try changing energy_perf_bias and see if it helps here.
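
A rough sketch of one way to change it, assuming the kernel exposes the
per-CPU energy_perf_bias attribute under
/sys/devices/system/cpu/cpuN/power/ (values 0 for performance through
15 for power saving). The helper name and the cpu_root parameter are
illustrative only, and writing the attribute needs root.

```python
import glob
import os


def set_energy_perf_bias(value, cpu_root="/sys/devices/system/cpu"):
    """Write the same energy_perf_bias value for every CPU found.

    Returns the list of sysfs files that were written. The default
    cpu_root is the assumed standard location of the attribute.
    """
    if not 0 <= value <= 15:
        raise ValueError("energy_perf_bias must be in the range 0..15")
    pattern = os.path.join(cpu_root, "cpu[0-9]*", "power", "energy_perf_bias")
    paths = sorted(glob.glob(pattern))
    for path in paths:
        with open(path, "w") as attr:
            attr.write(str(value))
    return paths


# Example (as root): bias all CPUs fully toward performance.
# set_energy_perf_bias(0)
```

Repeating the test with a couple of different values should show
whether the bias has any influence on the frequency drops.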

Thanks,
Srinivas


> ? "No such file or directory"
>
> > You may want to try running dptfxtract once.
>
> No, I am not going to.
>
> I am not running thermald. Eventually I will, as a backup
> in case of cooling failure, so as not to hit the processor limit
> shutdown. I just haven't done it yet.
>
> > Then try to get again
> >
> > cat /sys/bus/pci/devices/0000\:00\:04.0/tcc_offset_degree_celsius
> > cat /sys/class/powercap/intel-rapl-mmio/intel-rapl-
> > mmio:0/constraint_0_power_limit_uw
> >
> >
> > Thanks,
> > Srinivas
>
>