Re: [patch 02/24] perfmon: base code

From: stephane eranian
Date: Thu Nov 27 2008 - 12:48:17 EST


Thomas,



On Thu, Nov 27, 2008 at 6:24 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> On Wed, 26 Nov 2008, eranian@xxxxxxxxxxxxxx wrote:
>
>> Index: o3/perfmon/perfmon_res.c
>
>> +/*
>> + * global information about all sessions
>> + */
>> +struct pfm_resources {
>> + cpumask_t sys_cpumask; /* bitmask of used cpus */
>> + u32 thread_sessions; /* #num loaded per-thread sessions */
>> +};
>
> What's the purpose of this being a structure if it's just a single
> instance ?
>
There is a single instance.
I was just trying to regroup related fields together instead of having them as
separate variables. I can make the change.


>> +static struct pfm_resources pfm_res;
>> +
>> +static __cacheline_aligned_in_smp DEFINE_SPINLOCK(pfm_res_lock);
>
>> +/**
>> + * pfm_session_acquire - reserve a per-thread session
>> + *
>> + * return:
>> + * 0 : success
>> + * -EBUSY: if conflicting session exist
>
> Where ?

Not in the patchset, conflict can arise when you add system-wide sessions.

>
>> + */
>> +int pfm_session_acquire(void)
>> +{
>> + unsigned long flags;
>> + int ret = 0;
>> +
>> + /*
>> + * validy checks on cpu_mask have been done upstream
>> + */
>
> How please ? pfm_res.sys_cpumask is local to this file and you want
> to check it under the lock and _before_ you increment
> thread_sessions blindly.
>
Stale comment.

>
> +EXPORT_SYMBOL(pfm_session_allcpus_release);
>
> All what that code should do (in fact it does not) is preventing the
> mix of thread and system wide sessions.
>
True. That is a current limitation.

> It does neither need a cpumask nor tons of useless loops and debug
> outputs with zero value.
>
Well, the the cpumask is indeed needed but you don't see the reason
why in the patchset!

Perfmon in system-wide does not operate like Oprofile. In system-wide
a perfmon session (context) monitors only ONE CPU at a time. Each
session is independent of each other. You can therefore measure different
things on different CPUs. Reservation is thus done independently for each
CPU, therefore we need a cpu bitmask to track allocation.

The Oprofile reservation you see is built on top of the cpumask reservation.
It tries to allocate in one call and atomically ALL the CPUs as this is the way
Oprofile operates. Thus it fails if one perfmon system-wide session or one
perfmon per-thread exists. I believe there is enough bitmask functions to
avoid loops as you pointed out. I will fix that.


> static int global_session;
> static int thread_sessions;
> static DEFINE_SPINLOCK(session_lock);
>
> int pfm_session_request(int global)
> {
> unsigned long flags;
> int res = -EBUSY;
>
> spin_lock_irqsave(&session_lock, flags);
>
> if (!global && !global_session) {
> thread_sessions++;
> res = 0;
> }
>
> if (global && !thread_sessions && !global_session) {
> global_session = 1;
> res = 0;
> }
>
> spin_unlock_irqrestore(&session_lock, flags);
> return res;
> }
>
> void pfm_session_release(int global)
> {
> unsigned long flags;
>
> spin_lock_irqsave(&session_lock, flags);
>
> if (global) {
> WARN_ON(!global_session);
> global_session = 0;
> } else {
> if (!global_session && thread_sessions)
> thread_session--;
> else
> WARN();
> }
>
> spin_unlock_irqrestore(&session_lock, flags);
> }
>
> Would do it nicely including useful sanity checks and 70% less code.
>
> Thanks,
>
> tglx
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/