RE: [PATCH 1/1] perf, core: Use sample period avg as child event's initial period

From: Liang, Kan
Date: Tue Dec 16 2014 - 11:07:42 EST




>
> On Mon, Dec 15, 2014 at 09:17:33PM +0000, Liang, Kan wrote:
> > > This doesn't seem to make any kind of sense, and its weirdly
> > > implemented.
> > >
> > > So why would you push anything to the original parent? Your
> > > description states that the parent event usually has 1, and then you
> > > argue about fixing that by using the orig parent, but then you need to
> > > update the orig parent.
> > > Did you go in circles and confuse yourself? Why not push things into
> > > the regular parent event if you're going to push things up.
> >
> > My thought is that the original parent is the root of the tree.
>
> No parent is the root; I thought your orig parent thing was the event you
> forked from, but now I see its not.
>
> See inherit_event(), event->parent is the root event.
>
I see. I will keep the information in the regular parent event.

> > If there is an
> > average sample period for nodes, it should be kept in the root node,
> > since it's the only node everyone knows.
>
> Right, but, that's also contention central..
>
> > > Also, since you can have multiple child events, on many CPUs
> > > local64_t is the wrong data type, furthermore its going to be a
> > > scalability issue on big hardware.
> >
> > I'd like to have avg_sample_period for each CPU. The similar usage is
> > period_left in hw_perf_event.
>
> Well, some events are per cpu, some are per task. The per task events do
> not have per-cpu storage and their parent can be on whatever cpu.
>

I will use atomic64_t to replace local64_t.

> > We don't need to share the avg_sample_period between CPUs, after all
> > it's only a reference.
>
> Right, some smarts are needed to avoid the worst contention there.
> Maybe a jiffy timestamp and don't update more than once every HZ jiffies
> or so.
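
For illustration, the rate limiting suggested above could look roughly
like the userspace sketch below. It is not kernel code: struct avg_state,
SKETCH_HZ and avg_update_ratelimited() are hypothetical names, C11 atomics
stand in for atomic64_t, and the caller passes the current tick count in
place of jiffies.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HZ 1000ULL	/* hypothetical tick rate, stands in for HZ */

struct avg_state {		/* hypothetical, not a kernel structure */
	_Atomic uint64_t avg_sample_period;
	_Atomic uint64_t last_update;	/* tick of the last accepted update */
};

/*
 * Fold sample_period into the shared average, but at most once per
 * SKETCH_HZ ticks. Returns true if the average was updated.
 */
static bool avg_update_ratelimited(struct avg_state *s, uint64_t now,
				   uint64_t sample_period)
{
	uint64_t last, avg;

	assert(s != NULL);

	last = atomic_load_explicit(&s->last_update, memory_order_relaxed);
	if (now - last < SKETCH_HZ)
		return false;	/* too soon: skip without writing shared state */

	/* Of several racing updaters only one claims the window. */
	if (!atomic_compare_exchange_strong(&s->last_update, &last, now))
		return false;

	avg = atomic_load(&s->avg_sample_period);
	atomic_store(&s->avg_sample_period, (avg + sample_period) / 2);
	return true;
}
```

The common case is a single relaxed load followed by an early return, so
most callers never write to the shared cache line.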

perf_adjust_freq_unthr_context() is called on each tick, so I think we can
update avg_sample_period only in this function.
I also think we should update avg_sample_period on the first period
adjustment, so that we get rid of the initial period of 1 as early as
possible.
I will change the update part as below.

@@ -2810,8 +2833,10 @@ static void perf_adjust_period(struct perf_event *event, u64 nsec, u64 count, bo

 	hwc->sample_period = sample_period;

+	if (!disable || (atomic64_read(&head_event->avg_sample_period) == 1)) {
+		avg_period = (atomic64_read(&head_event->avg_sample_period) + sample_period) / 2;
+		atomic64_set(&head_event->avg_sample_period, avg_period);
+	}

 	if (local64_read(&hwc->period_left) > 8*sample_period) {
 		if (disable)
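
As a rough illustration of the gate in the hunk above, here is a userspace
model, not the kernel code. avg_period_update() is a hypothetical name, and
a C11 atomic stands in for head_event->avg_sample_period. It assumes, as the
discussion suggests, that disable is false when the adjustment comes from
the tick path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Update the shared average either when called with disable == false
 * (assumed here to be the per-tick path) or when the average still holds
 * the initial value 1, so the first adjustment always replaces it.
 */
static void avg_period_update(_Atomic uint64_t *avg, uint64_t sample_period,
			      bool disable)
{
	uint64_t cur;

	assert(avg != NULL);

	cur = atomic_load(avg);
	if (!disable || cur == 1)
		atomic_store(avg, (cur + sample_period) / 2);
}
```

The load and the store are separate operations, so two racing updaters can
lose one sample. That seems acceptable here because the value is only used
as a reference for the initial period of child events.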

Thanks,
Kan
