Re: perf: some questions about perf software events

From: Franck Bui-Huu
Date: Sat Nov 27 2010 - 08:39:16 EST


Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> writes:

> On Wed, 2010-11-24 at 12:35 +0100, Franck Bui-Huu wrote:

[...]

>> That is for no 'contiguous' events, setting a sampling frequency doesn't
>> really make sense since for example you could set a frequency to 1000 HZ
>> for the software ALIGNMENT_FAULT event and never get any samplings or at
>> least getting sampling but with a totally different rate. And the
>> current code doesn't look to handle sample_freq anyway.
>
> All the freq bits are in the generic code, it re-computes the rate on
> the timer-tick as well as on each event occurrence.
>
> Freq driven sampling should work just fine with swevents.
>

Yes, but how does it behave with ALIGNMENT_FAULTS, for example?

Such an event may occur at a very irregular rate, or it may never occur
at all.
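
To make the concern concrete, here is a rough user-space model of
frequency-driven period adjustment. It is only a sketch under my own
assumptions: estimate_period() is a hypothetical helper, not the kernel's
code, and it simply picks the period that would give sample_freq samples
per second if the observed event rate stayed the same.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Simplified model (an assumption, not the kernel implementation):
 * given `count` events observed over `elapsed_ns`, return the sample
 * period that would yield roughly `sample_freq` samples per second
 * if the event rate stayed the same.
 *
 * Overflow of count * NSEC_PER_SEC is ignored for brevity.
 */
uint64_t estimate_period(uint64_t count, uint64_t elapsed_ns,
			 uint64_t sample_freq)
{
	uint64_t events_per_sec, period;

	if (!elapsed_ns || !sample_freq)
		return 1;

	events_per_sec = count * NSEC_PER_SEC / elapsed_ns;
	period = events_per_sec / sample_freq;

	/* A period below 1 is meaningless: fall back to sampling every event. */
	return period ? period : 1;
}
```

In this model, an event that advances 1000000000 times per second gets a
period of 1000000 for a 1000 Hz request. With only 3 alignment faults per
second the period bottoms out at 1, so at best there are 3 samples per
second, not 1000.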

>
>> Also I'm currently not seeing any real differences between cpu-clock and
>> task-clock events. They both seem to count the time elapsed when the
>> task is running on a CPU. Am I wrong ?
>
> No, Francis already noticed that, I probably wrecked it when I added the
> multi-pmu stuff, its on my todo list to look at (Francis also handed me
> a little patchlet), but I keep getting distracted with other stuff :/

OK.

Does it make sense to adjust the period for both of them?

Also, when creating a task clock event, passing 'pid=-1' to
sys_perf_event_open() doesn't really make sense, does it?

The same goes for cpu clock and 'pid=n': whatever the value of <n>, the
event measures the cpu wall clock time.

Perhaps it would have been better to expose only one clock in the API
and bind it internally to the cpu or task clock, depending on the pid
and cpu parameters?

Something like the following:


diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index bb1884c..ad50551 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -105,6 +105,7 @@ enum perf_sw_ids {
 	PERF_COUNT_SW_PAGE_FAULTS_MAJ = 6,
 	PERF_COUNT_SW_ALIGNMENT_FAULTS = 7,
 	PERF_COUNT_SW_EMULATION_FAULTS = 8,
+	PERF_COUNT_SW_CLOCK = 9,

 	PERF_COUNT_SW_MAX, /* non-ABI */
 };
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 1dabb54..f3ff342 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -4981,7 +4981,15 @@ static int cpu_clock_event_init(struct perf_event *event)
 	if (event->attr.type != PERF_TYPE_SOFTWARE)
 		return -ENOENT;

-	if (event->attr.config != PERF_COUNT_SW_CPU_CLOCK)
+	switch (event->attr.config) {
+	case PERF_COUNT_SW_CPU_CLOCK:
+		break;
+	case PERF_COUNT_SW_CLOCK:
+		if (!(event->attach_state & PERF_ATTACH_TASK))
+			break;
+		/* fall-through */
+	default:
 		return -ENOENT;
+	}

 	return 0;
@@ -5058,8 +5066,16 @@ static int task_clock_event_init(struct perf_event *event)
 	if (event->attr.type != PERF_TYPE_SOFTWARE)
 		return -ENOENT;

-	if (event->attr.config != PERF_COUNT_SW_TASK_CLOCK)
+	switch (event->attr.config) {
+	case PERF_COUNT_SW_TASK_CLOCK:
+		break;
+	case PERF_COUNT_SW_CLOCK:
+		if (event->attach_state & PERF_ATTACH_TASK)
+			break;
+		/* fall-through */
+	default:
 		return -ENOENT;
+	}

 	return 0;
 }
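
The binding the two hunks above are meant to implement can be modelled in
user space as follows. This is only an illustration: the enum and
sw_clock_bind() are hypothetical names, and the model assumes that
PERF_ATTACH_TASK is set exactly when a target task is given (pid != -1).

```c
#include <sys/types.h>

/* Hypothetical names, for illustration only. */
enum sw_clock_kind {
	SW_CLOCK_CPU,	/* would be accepted by cpu_clock_event_init() */
	SW_CLOCK_TASK,	/* would be accepted by task_clock_event_init() */
};

/*
 * Model of the proposed PERF_COUNT_SW_CLOCK binding: an event attached
 * to a task (pid != -1) gets the task clock, a per-cpu event
 * (pid == -1) gets the cpu clock.
 */
enum sw_clock_kind sw_clock_bind(pid_t pid)
{
	return pid == -1 ? SW_CLOCK_CPU : SW_CLOCK_TASK;
}
```

So a caller passing pid=0 (the current task) or pid=n would end up with
the task clock, and a caller passing pid=-1 with a valid cpu would end up
with the cpu clock, without having to pick between two event ids.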


--
Franck