Re: [PATCH v2] tools/power turbostat: Fix RAPL summary collection on AMD processors

From: Chen Yu
Date: Fri Apr 23 2021 - 21:31:23 EST


On Fri, Apr 23, 2021 at 10:00:14AM -0400, Calvin Walton wrote:
> On Fri, 2021-04-23 at 14:19 +0200, Borislav Petkov wrote:
> > On Fri, Apr 23, 2021 at 08:16:07PM +0800, Chen Yu wrote:
> > > From b2e63fe4f02e17289414b4f61237da822df115fb Mon Sep 17 00:00:00
> > > 2001
> > > From: Calvin Walton <calvin.walton@xxxxxxxxxx>
> > > Date: Fri, 23 Apr 2021 17:32:13 +0800
> > > Subject: [PATCH 3/5] tools/power turbostat: Fix offset overflow
> > > issue in index
> > >  converting
> > >
> > > The idx_to_offset() function returns type int (32-bit signed), but
> > > MSR_PKG_ENERGY_STAT is greater than INT_MAX (or rather, would be
> > > > interpreted as a negative number). The end result is that it hits
> > > > the if (offset < 0) check in update_msr_sum(), so the timer
> > > > callback that updates the stat in the background during long
> > > > durations never runs. A similar issue exists in offset_to_idx()
> > > > and update_msr_sum().
> > >
> > > This patch fixes this issue by converting the 'int' type to 'off_t'
> > > accordingly.
> > >
> > > Fixes: 9972d5d84d76 ("tools/power turbostat: Enable accumulate RAPL
> > > display")
> > > Signed-off-by: Chen Yu <yu.c.chen@xxxxxxxxx>
> >
> > This patch's authorship is weird: it says From: Calvin but doesn't
> > have
> > his SOB here - only yours.
>
> I think this patch is adapted from one of my earlier submissions? I
> don't think I can really say that I wrote it, but I'll certainly review
> it.
>
> >
> > > ---
> > >  tools/power/x86/turbostat/turbostat.c | 10 +++++-----
> > >  1 file changed, 5 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/tools/power/x86/turbostat/turbostat.c
> > > b/tools/power/x86/turbostat/turbostat.c
> > > index a211264b57fd..77557122b292 100644
> > > --- a/tools/power/x86/turbostat/turbostat.c
> > > +++ b/tools/power/x86/turbostat/turbostat.c
> > > @@ -296,9 +296,9 @@ struct msr_sum_array {
> > >  /* The percpu MSR sum array.*/
> > >  struct msr_sum_array *per_cpu_msr_sum;
> > >  
> > > -int idx_to_offset(int idx)
> > > +off_t idx_to_offset(int idx)
> >
> > And this is silly. MSRs are unsigned int. Fullstop.
> >
> > So that function should either return u32 or unsigned int or so.
>
> So, there's two problems with that:
> 1. This function needs to be able to return an error value that cannot be
> confused with a valid MSR. This is currently done by returning a
> negative number. If an unsigned value is used, a different way of
> indicating errors needs to be written.
> 2. We are not using CPU instructions to access MSRs directly. Instead
> they are being read from /dev/msr. So the "offset" value is actually a
> seek into the /dev/msr file (using pread), and thus is of type off_t.
>
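For reference, the two points above can be sketched roughly as follows. This
is only an illustration, not the code from the patch: the names
idx_to_offset_sketch(), read_msr_sketch() and IDX_PKG_ENERGY_SKETCH are made
up for this example, and it assumes a 64-bit off_t so that 0xc001029b stays
positive while -1 remains available as the error value.

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* AMD package energy MSR address (greater than INT_MAX). */
#define MSR_PKG_ENERGY_STAT 0xc001029bU

/* Hypothetical index value, for illustration only. */
enum { IDX_PKG_ENERGY_SKETCH = 0 };

/*
 * Assumption: off_t is 64-bit (64-bit Linux, or a 32-bit build with
 * -D_FILE_OFFSET_BITS=64). With a 32-bit off_t the overflow would remain.
 */
_Static_assert(sizeof(off_t) >= 8, "sketch assumes a 64-bit off_t");

/*
 * Map an index to a seek offset into the msr device file.
 * A negative return value signals an unknown index; it cannot collide
 * with a valid MSR address because those fit in 32 unsigned bits.
 */
static off_t idx_to_offset_sketch(int idx)
{
	switch (idx) {
	case IDX_PKG_ENERGY_SKETCH:
		return (off_t)MSR_PKG_ENERGY_STAT;
	default:
		return -1;
	}
}

/*
 * Read one 64-bit MSR value from an already opened msr device fd
 * (for example /dev/cpu/0/msr). The MSR address is used as the file
 * offset passed to pread(). Returns 0 on success, -1 on failure.
 */
static int read_msr_sketch(int fd, off_t offset, uint64_t *val)
{
	if (offset < 0)
		return -1;
	if (pread(fd, val, sizeof(*val), offset) != (ssize_t)sizeof(*val))
		return -1;
	return 0;
}
```

With an int return type, the same constant would not fit and would typically
come back negative, which is exactly what trips the (offset < 0) check.
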
I see, I confused it with the kernel's rdmsr() and ... I'll use the original version.

thanks,
Chenyu
> --
> Calvin Walton <calvin.walton@xxxxxxxxxx>
>