Re: [PATCH v2 2/4] perf: Align user space counter reading with code

From: Ian Rogers
Date: Tue Jul 19 2022 - 18:42:37 EST


On Tue, Jul 19, 2022 at 10:45 AM Rob Herring <robh@xxxxxxxxxx> wrote:
>
> On Wed, Jun 8, 2022 at 11:24 PM Ian Rogers <irogers@xxxxxxxxxx> wrote:
> >
> > Align the user space counter reading documentation with the code in
> > perf_mmap__read_self. Previously the documentation was based on the perf
> > rdpmc test, but now general purpose code is provided by libperf.
> >
> > Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> > ---
> > include/uapi/linux/perf_event.h | 32 ++++++++++++++++-----------
> > tools/include/uapi/linux/perf_event.h | 32 ++++++++++++++++-----------
> > 2 files changed, 38 insertions(+), 26 deletions(-)
> >
> > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> > index d37629dbad72..3b84e0ad0723 100644
> > --- a/include/uapi/linux/perf_event.h
> > +++ b/include/uapi/linux/perf_event.h
> > @@ -538,9 +538,13 @@ struct perf_event_mmap_page {
> > *
> > * if (pc->cap_usr_time && enabled != running) {
> > * cyc = rdtsc();
> > - * time_offset = pc->time_offset;
> > * time_mult = pc->time_mult;
> > * time_shift = pc->time_shift;
> > + * time_offset = pc->time_offset;
> > + * if (pc->cap_user_time_short) {
> > + * time_cycles = pc->time_cycles;
> > + * time_mask = pc->time_mask;
> > + * }
> > * }
> > *
> > * index = pc->index;
> > @@ -548,6 +552,9 @@ struct perf_event_mmap_page {
> > * if (pc->cap_user_rdpmc && index) {
> > * width = pc->pmc_width;
> > * pmc = rdpmc(index - 1);
> > + * pmc <<= 64 - width;
> > + * pmc >>= 64 - width;
> > + * count += pmc;
> > * }
> > *
> > * barrier();
> > @@ -590,25 +597,24 @@ struct perf_event_mmap_page {
> > * If cap_usr_time the below fields can be used to compute the time
> > * delta since time_enabled (in ns) using rdtsc or similar.
> > *
> > - * u64 quot, rem;
> > - * u64 delta;
> > - *
> > - * quot = (cyc >> time_shift);
> > - * rem = cyc & (((u64)1 << time_shift) - 1);
> > - * delta = time_offset + quot * time_mult +
> > - * ((rem * time_mult) >> time_shift);
> > + * cyc = time_cycles + ((cyc - time_cycles) & time_mask);
> > + * delta = time_offset + mul_u64_u32_shr(cyc, time_mult, time_shift);
>
> I still think this chunk should stay as-is because mul_u64_u32_shr
> isn't defined here. At least a comment as to what the C version does:
>
> /* time_offset + (u64)(((unsigned __int128)cyc * time_mult) >> time_shift) */
>
> > *
> > * Where time_offset,time_mult,time_shift and cyc are read in the
> > * seqcount loop described above. This delta can then be added to
> > - * enabled and possible running (if index), improving the scaling:
> > + * enabled and possible running (if index) to improve the scaling. Due
> > + * to event multiplexing, running maybe zero and so care is needed to
>
> s/maybe/may be/
>
> > + * avoid division by zero.
> > *
> > * enabled += delta;
> > - * if (index)
> > + * if (idx)
>
> This should remain 'index'.

Thanks, I've tried to incorporate all of this into v3:
https://lore.kernel.org/lkml/20220719223946.176299-1-irogers@xxxxxxxxxx/

Ian

> > * running += delta;
> > *
> > - * quot = count / running;
> > - * rem = count % running;
> > - * count = quot * enabled + (rem * enabled) / running;
> > + * if (running != 0) {
> > + * quot = count / running;
> > + * rem = count % running;
> > + * count = quot * enabled + (rem * enabled) / running;
> > + * }
> > */
> > __u16 time_shift;
> > __u32 time_mult;