Re: [rfc 1/3] perf, x86: P4 PMU - describe config format

From: Stephane Eranian
Date: Fri Nov 26 2010 - 11:23:03 EST


On Fri, Nov 26, 2010 at 4:27 PM, Cyrill Gorcunov <gorcunov@xxxxxxxxx> wrote:
> On Fri, Nov 26, 2010 at 02:54:39PM +0100, Stephane Eranian wrote:
> ...
>> >
>> > No, single thread mode means _any_ single thread is running,
>> > Stephane I'll describe some more a bit later (as only reach home),
>> > ok?
>>
>> From the manual:
>>
>> 00 - None. Count only when neither logical processor is active.
>> 01 - Single. Count only when one logical processor is active (either 0 or 1).
>> 10 - Both. Count only when both logical processors are active.
>> 11 - Any. Count when either logical processor is active.
>>
>> In per-thread mode, you won't hit 00. I suspect you want to
>> disallow 01, 10 (or CAP_SYS_ADMIN). Otherwise, you want
>> to force 11, i.e., can't figure out what's going on in the other
>> HT thread.
>>
>
> No ;) The key point here is that these flags are related to the _activity_ of
> logical threads, and I guess they were introduced just to allow measuring
> whether a user-space application wins from using HT or not (since for
> some loads HT simply drops the performance).
>

I think what they call 'logical CPU' is what the kernel calls CPU.
So I think bits 16-17 are used if you want to measure on CPU0 only
when CPU1 (assuming both share the same physical core) is active,
or inactive, or you don't care. You're right, I believe this mode was
introduced to measure the level of concurrency between HT
threads (logical CPUs).
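
To make the encoding concrete, here is a rough sketch in C of how I read
bits 16-17. The macro, enum and function names are made up for illustration
(they are not the kernel's); only the bit positions and the four encodings
come from the manual text quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; bits 16-17 of the CCCR hold the active-thread mode. */
#define CCCR_ACTIVE_THREAD_SHIFT	16
#define CCCR_ACTIVE_THREAD_MASK		(0x3u << CCCR_ACTIVE_THREAD_SHIFT)

enum active_thread_mode {
	AT_NONE   = 0,	/* 00: neither logical processor active */
	AT_SINGLE = 1,	/* 01: exactly one logical processor active */
	AT_BOTH   = 2,	/* 10: both logical processors active */
	AT_ANY    = 3,	/* 11: either logical processor active */
};

/* Extract the 2-bit active-thread field from a raw CCCR value. */
static enum active_thread_mode cccr_active_thread(uint32_t cccr)
{
	return (enum active_thread_mode)
		((cccr & CCCR_ACTIVE_THREAD_MASK) >> CCCR_ACTIVE_THREAD_SHIFT);
}

/*
 * Would the counter tick, given how many of the two logical CPUs
 * sharing the core are currently active (0, 1 or 2)?
 */
static int counts_when(enum active_thread_mode mode, int nr_active)
{
	assert(nr_active >= 0 && nr_active <= 2);

	switch (mode) {
	case AT_NONE:	return nr_active == 0;
	case AT_SINGLE:	return nr_active == 1;
	case AT_BOTH:	return nr_active == 2;
	case AT_ANY:	return nr_active >= 1;
	}
	return 0;
}
```

Note that the field says nothing about _whose_ events are counted, only
_when_ counting is enabled.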

In architectural perfmon the .any modifier is slightly different.
It indicates whether you want to measure only yourself or both
threads (regardless of the state of the other HT thread). In other
words, .any=1 does not mean that the event counts ONLY when
both threads (logical CPUs) are active.
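
A toy model of that difference, as a sketch only: the function name and the
per-thread event counts are hypothetical, and the AnyThread bit position
(bit 21 of IA32_PERFEVTSELx) is how I read the architectural perfmon
definition.

```c
#include <stdint.h>

/* AnyThread bit in IA32_PERFEVTSELx, per my reading of the SDM. */
#define EVTSEL_ANY	(1ULL << 21)

/*
 * Toy model: given the events generated by this HT thread and by its
 * sibling during some interval, return what the counter would report.
 * .any selects WHOSE events are counted; it does not gate counting on
 * whether the sibling is active.
 */
static uint64_t arch_count(uint64_t evtsel, uint64_t own_events,
			   uint64_t sibling_events)
{
	if (evtsel & EVTSEL_ANY)
		return own_events + sibling_events;
	return own_events;
}
```

With the sibling idle (sibling_events == 0), .any=1 still reports the
thread's own events, which is exactly where it differs from the P4
"Both" encoding.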



> But I guess what you have in mind is actually set in ESCR register --
> flags T0/1_USR, T0/1_OS. And these bits are controlled by kernel and

That's different, yes.

> "measurement" of events happening on another thread is simply not
> allowed, though you still can set on which CPL level measure the event
> by 'exclude_kernel','exclude_user' config attributes.
>
But CPL is orthogonal to CPUs.
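
A small sketch of what I mean by orthogonal: the privilege-level filter is
chosen per logical thread in the ESCR, independently of the active-thread
mode in the CCCR. The bit values are how I recall the ESCR layout (T1_USR
bit 0, T1_OS bit 1, T0_USR bit 2, T0_OS bit 3) and should be checked against
the manual; the macro and helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed ESCR privilege-level bits, one USR/OS pair per logical thread. */
#define ESCR_T1_USR	0x1u
#define ESCR_T1_OS	0x2u
#define ESCR_T0_USR	0x4u
#define ESCR_T0_OS	0x8u

/*
 * Build the CPL bits for the logical thread the event is bound to
 * (0 or 1). The sibling's bits are left clear, so nothing is
 * propagated to the other thread.
 */
static uint32_t escr_cpl_bits(int thread, bool exclude_kernel,
			      bool exclude_user)
{
	uint32_t os  = thread ? ESCR_T1_OS  : ESCR_T0_OS;
	uint32_t usr = thread ? ESCR_T1_USR : ESCR_T0_USR;
	uint32_t bits = 0;

	if (!exclude_kernel)
		bits |= os;
	if (!exclude_user)
		bits |= usr;
	return bits;
}
```

Whatever this returns, bits 16-17 of the CCCR can still take any of the
four values, which is why the two settings need to be discussed separately.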

> Though there are still events which are "shared" across threads,
> so such events will need CAP_SYS_ADMIN permission.
>
That's a different event category. I think this is yet a different problem.

Bits 16-17 apply to any event.

> Here is what I've put in comments while we were touching this code.
>
>        /*
>         * NOTE: P4_CCCR_THREAD_ANY has not the same meaning as
>         * in Architectural Performance Monitoring, it means not
>         * on _which_ logical cpu to count but rather _when_, ie it
>         * depends on logical cpu state -- count event if one cpu active,
>         * none, both or any, so we just allow user to pass any value
>         * desired.
>         *
>         * In turn we always set Tx_OS/Tx_USR bits bound to logical
>         * cpu without their propagation to another cpu
>         */
>
>        /*
>         * if an event is shared across the logical threads
>         * the user needs special permissions to be able to use it
>         */
>        if (p4_event_bind_map[v].shared) {
>                if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
>                        return -EACCES;
>        }
>
> Cyrill
>