Re: [PATCH v2] events/core: fix account failure for event's total_enable_time

From: Yeoreum Yun
Date: Thu Mar 06 2025 - 08:44:28 EST


> > Hi Peter,
> >
> > Sorry for the late answer. I missed your last response in this thread,
> > and was waiting for it in the new thread:
> > https://lore.kernel.org/all/20250110163643.GB4213@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> >
> > > > This patch doesn't work when the event is a child event.
> > > > In the parent event's case, as you can see in list_del_event(),
> > > > the total_enable_time is updated properly by changing the state to
> > > > PERF_EVENT_STATE_OFF.
> > > >
> > > > However, the child event's total_enable_time is added before
> > > > list_del_event(). So the missing total_enable_time isn't added to the
> > > > parent's event and the error print happens.
> > > >
> > > > So I think it wouldn't be possible to update the time with set_state.
> > > > Instead, I think it should update total_enable_time before the
> > > > child's total_enable_time is added to the parent's child_total_enable_time,
> > > >
> > > > like:
> > > >
> > > > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > > > index 065f9188b44a..d27717c44924 100644
> > > > --- a/kernel/events/core.c
> > > > +++ b/kernel/events/core.c
> > > > @@ -13337,6 +13337,7 @@ static void sync_child_event(struct perf_event *child_event)
> > > >  	}
> > > >
> > > >  	child_val = perf_event_count(child_event, false);
> > > > +	perf_event_update_time(child_event);
> > > >
> > > >  	/*
> > > >  	 * Add back the child's count to the parent's count:
> > >
> > > Well, that again violates the rule that we update time on state change.
> > >
> > > AFAICT there is no issue with simply moving the perf_event_set_state()
> > > up a few lines in __perf_remove_from_context().
> > >
> > > Notably event_sched_out() will already put us in OFF state; and nothing
> > > after that cares about further states AFAICT.
> > >
> > > So isn't the below the simpler solution?
> > >
> > > --- a/kernel/events/core.c
> > > +++ b/kernel/events/core.c
> > > @@ -2438,14 +2438,13 @@ __perf_remove_from_context(struct perf_e
> > >  		state = PERF_EVENT_STATE_DEAD;
> > >  	}
> > >  	event_sched_out(event, ctx);
> > > +	perf_event_set_state(event, min(event->state, state));
> > >  	if (flags & DETACH_GROUP)
> > >  		perf_group_detach(event);
> > >  	if (flags & DETACH_CHILD)
> > >  		perf_child_detach(event);
> > >  	list_del_event(event, ctx);
> > >
> > > -	perf_event_set_state(event, min(event->state, state));
> > > -
> > >  	if (!pmu_ctx->nr_events) {
> > >  		pmu_ctx->rotate_necessary = 0;
> >
> > Agreed. For the DETACH_EXIT case, the code below in list_del_event()
> > doesn't need to be considered, because all of the events related to
> > the event ctx would be freed:
> >
> > 	/*
> > 	 * If event was in error state, then keep it
> > 	 * that way, otherwise bogus counts will be
> > 	 * returned on read(). The only way to get out
> > 	 * of error state is by explicit re-enabling
> > 	 * of the event
> > 	 */
> > 	if (event->state > PERF_EVENT_STATE_OFF) {
> > 		perf_cgroup_event_disable(event, ctx);
> > 		perf_event_set_state(event, PERF_EVENT_STATE_OFF);
> > 	}
> >
> > With your suggestion, could I send a v4 for this?
>
> Yes, please send a -v4 version.
>
> Thanks,
>
> Ingo

Hi Ingo,
Sorry for the late reply.

Here is v4:
- https://lore.kernel.org/all/20250306123350.1650114-1-yeoreum.yun@xxxxxxx/
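
For anyone following the thread, the ordering issue discussed above can be
illustrated with a small user-space model. This is only a sketch, not kernel
code: every toy_* name, the state values and the timestamps are made up for
illustration. It assumes only what was said in the thread, namely that time is
accounted on a state change and that the child's time is folded into the
parent when the child is detached.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy states; names and values are assumptions, loosely mirroring perf. */
enum toy_state { TOY_EXIT = -3, TOY_OFF = -1, TOY_INACTIVE = 0, TOY_ACTIVE = 1 };

struct toy_event {
	enum toy_state state;
	uint64_t tstamp;              /* time of the last accounting update */
	uint64_t total_enabled;       /* enabled time accounted so far */
	uint64_t child_total_enabled; /* time folded in from detached children */
	struct toy_event *parent;
};

/* Account the time elapsed since tstamp, only while the event is enabled. */
static void toy_update_time(struct toy_event *e, uint64_t now)
{
	if (e->state >= TOY_INACTIVE)
		e->total_enabled += now - e->tstamp;
	e->tstamp = now;
}

/* The rule from the thread: time is updated when the state changes. */
static void toy_set_state(struct toy_event *e, enum toy_state s, uint64_t now)
{
	if (e->state == s)
		return;
	toy_update_time(e, now);
	e->state = s;
}

/* Fold the child's accounted time into the parent (hypothetical helper). */
static void toy_child_detach(struct toy_event *child)
{
	if (child->parent)
		child->parent->child_total_enabled += child->total_enabled;
}

/*
 * Remove a child that has been enabled since t=0 at t=100 and return the
 * time the parent ends up with. set_state_first selects the ordering.
 */
uint64_t toy_remove(int set_state_first)
{
	struct toy_event parent = { TOY_INACTIVE, 0, 0, 0, NULL };
	struct toy_event child = { TOY_INACTIVE, 0, 0, 0, &parent };
	const uint64_t now = 100;

	if (set_state_first)
		toy_set_state(&child, TOY_EXIT, now);
	toy_child_detach(&child);
	if (!set_state_first)
		toy_set_state(&child, TOY_EXIT, now);

	/* Either way the child itself has accounted its time by now. */
	assert(child.total_enabled == now);
	return parent.child_total_enabled;
}
```

In this model toy_remove(1) hands the full 100 units to the parent, while
toy_remove(0) hands over nothing because the child's time is accounted only
after it has already been folded in. That is the same reason the set_state
call is moved above the detach step in the suggested change.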

Thanks!