Re: [PATCH v2] tools/power turbostat: Fix RAPL summary collection on AMD processors

From: Chen Yu
Date: Tue Apr 20 2021 - 10:34:13 EST


On Tue, Apr 20, 2021 at 09:28:06AM -0400, Calvin Walton wrote:
> On Tue, 2021-04-20 at 21:15 +0800, Chen Yu wrote:
> >
> > Okay. I would vote for the patch from Bas as it was a combined work
> > from two authors and tested by several AMD users. But let me paste it
> > here too for Artem to see if this also works for him:
> >
> >
> > From 00e0622b1b693a5c7dc343aeb3aa51614a9e125e Mon Sep 17 00:00:00 2001
> > From: Bas Nieuwenhuizen <bas@xxxxxxxxxxxxxxxxxxx>
> > Date: Fri, 12 Mar 2021 21:27:40 +0800
> > Subject: [PATCH] tools/power/turbostat: Fix turbostat for AMD Zen CPUs
> >
> >
> > @@ -297,7 +297,10 @@ int idx_to_offset(int idx)
> >  
> >         switch (idx) {
> >         case IDX_PKG_ENERGY:
> > -               offset = MSR_PKG_ENERGY_STATUS;
> > +               if (do_rapl & RAPL_AMD_F17H)
> > +                       offset = MSR_PKG_ENERGY_STAT;
> > +               else
> > +                       offset = MSR_PKG_ENERGY_STATUS;
> >                 break;
> >         case IDX_DRAM_ENERGY:
> >                 offset = MSR_DRAM_ENERGY_STATUS;
>
> This patch has the same issue I noticed with the initial revision of
> Terry's patch - the idx_to_offset function returns type int (32-bit
> signed), but MSR_PKG_ENERGY_STAT is greater than INT_MAX (or rather,
> would be interpreted as a negative number)
>
> The end result is, as far as I can tell, that it hits the if (offset <
> 0) check in update_msr_sum(), so the timer callback that updates the
> stat in the background when long durations are used never happens.
>
> For short durations it still works fine since the background update
> isn't used.
>
Ah, got it, nice catch. How about an incremental patch based on Bas' one
to fix this 'overflow' issue? Would converting offset_to_idx(), idx_to_offset() and
update_msr_sum() to use off_t instead of int be enough? Would you or Terry be
interested in cooking up that patch? For Terry's version, I'm not sure whether
splitting the code by CPU vendor would benefit us in the future, unless
we expect plenty of new MSRs to be introduced.
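
To make the int vs. off_t question concrete, here is a rough sketch, not
the actual patch. The ex_ names are made up for illustration, and the AMD
MSR address 0xc001029b is from memory, so please treat it as an assumption.
It also assumes a build where off_t is 64 bits wide.

```c
#define _FILE_OFFSET_BITS 64	/* assumption: build with a 64-bit off_t */
#include <sys/types.h>

/* Illustrative constants, not copied from the patch. 0x611 is the Intel
 * package energy MSR; 0xc001029b is, as far as I remember, the AMD one. */
#define EX_MSR_PKG_ENERGY_STATUS	0x611
#define EX_MSR_PKG_ENERGY_STAT		0xc001029b

/* Today's shape: an int return cannot hold an address above INT_MAX. */
static int ex_idx_to_offset_int(int is_amd)
{
	if (is_amd)
		return (int)EX_MSR_PKG_ENERGY_STAT;	/* negative on the usual ABIs */
	return EX_MSR_PKG_ENERGY_STATUS;
}

/* Proposed shape: off_t keeps the address positive, so "< 0" can still
 * mean "no such MSR". */
static off_t ex_idx_to_offset(int is_amd)
{
	if (is_amd)
		return EX_MSR_PKG_ENERGY_STAT;
	return EX_MSR_PKG_ENERGY_STATUS;
}
```

With the int version the AMD case comes back negative and would trip the
(offset < 0) check; with off_t it stays positive. If off_t were only 32
bits on some build, the same problem would remain.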

thanks,
Chenyu
>
> --
> Calvin Walton <calvin.walton@xxxxxxxxxx>
>