Re: [PATCH v6 2/2] ThunderX2: Add Cavium ThunderX2 SoC UNCORE PMU driver

From: Suzuki K Poulose
Date: Thu Oct 11 2018 - 13:12:40 EST


Hi Ganpatrao,

On 11/10/18 17:06, Ganapatrao Kulkarni wrote:
On Thu, Oct 11, 2018 at 2:43 PM Suzuki K Poulose <suzuki.poulose@xxxxxxx> wrote:

Hi Ganapatrao,

On 11/10/18 07:39, Ganapatrao Kulkarni wrote:
+
+/*
+ * We must NOT create groups containing events from multiple hardware PMUs,
+ * although mixing different software and hardware PMUs is allowed.
+ */
+static bool thunderx2_uncore_validate_event_group(struct perf_event *event)
+{
+ struct pmu *pmu = event->pmu;
+ struct perf_event *leader = event->group_leader;
+ struct perf_event *sibling;
+ int counters = 0;
+
+ if (leader->pmu != event->pmu && !is_software_event(leader))
+ return false;
+
+ for_each_sibling_event(sibling, event->group_leader) {
+ if (is_software_event(sibling))
+ continue;
+ if (sibling->pmu != pmu)
+ return false;
+ counters++;
+ }

The above loop will not account for :
1) The leader event
2) The event that we are about to add.

So you need to allocate counters for both the above to make sure we can
accommodate the events.

Is leader also part of sibling list?

No.

not sure, what you are expecting here,


The purpose of the validate_group is to ensure if a group of events can
be scheduled together. That implies, if the new event is added to the
group, do we have enough counters to profile. So, you have :

1 (leader) + number-of-siblings-in-the-group + 1(this_new) events

But you only account for the number-of-siblings.

this function is inline with other perf driver code added in driver/perf

No, it is not. arm-cci, arm-dsu-pmu, arm-pmu, all do separate accounting
for leader, siblings and the "new event".

Are we guaranteed that the counter0 is always allocated ? What if the
event is deleted and there are other events active ? A better check

timer will be active/restarted, till last bit is cleared( till last
counter is released).

Ok, so you kick off the timer when you start the first active event,
which is assumed to be on the counter0 and the timeout handler would
restart the counters when at least one event is active, as you don't
have a PMU pmu_{enable/disable} call back. But I am still not convinced
that assuming the "first counter" is always 0. I would rather use:
bitmap_weight() == 1 check.

would be :
if (find_last_bit(...) != pmu_uncore->uncore_dev->max_counters)

IIUC, find_last_bit will return zero if none of the counters are
active and we want to start timer for first active counter.


Well, if that was the case, how do you know if the bit zero is the only
bit set ?

AFAICS, lib/find_bit.c has :

unsigned long find_last_bit(const unsigned long *addr, unsigned long size)
{
if (size) {
unsigned long val = BITMAP_LAST_WORD_MASK(size);
unsigned long idx = (size-1) / BITS_PER_LONG;

do {
val &= addr[idx];
if (val)
return idx * BITS_PER_LONG + __fls(val);

val = ~0ul;
} while (idx--);
}
return size;
}


It returns "size" when it can't find a set bit.

before start, alloc_counter is called to set, it seems we never hit
case where at-least one bit is not set.
timer will be started for first bit set and will continue till all
bits are cleared.



+static enum hrtimer_restart thunderx2_uncore_hrtimer_callback(
+ struct hrtimer *hrt)
+{
+ struct thunderx2_pmu_uncore_channel *pmu_uncore;
+ unsigned long irqflags;
+ int idx;
+ bool restart_timer = false;
+
+ pmu_uncore = container_of(hrt, struct thunderx2_pmu_uncore_channel,
+ hrtimer);
+
+ raw_spin_lock_irqsave(&pmu_uncore->uncore_dev->lock, irqflags);
+ for_each_set_bit(idx, pmu_uncore->active_counters,
+ pmu_uncore->uncore_dev->max_counters) {
+ struct perf_event *event = pmu_uncore->events[idx];

Do we need to check if the "event" is not NULL ? Couldn't this race with
an event_del() operation ?

i dont think so, i have not come across any such race condition during
my testing.

Thinking about it further, we may not hit it, as the PMU is "disable"d,
before "add/del", which implies that the IRQ won't be triggered.

Taking another look, since you don't have pmu_disable/enable callbacks,
the events could still be running. However, you are protected by the
free_counter() which holds the uncore_dev->lock to prevent the race.


Btw, in general, not hitting a race condition doesn't prove there cannot
be a race ;)

want to avoid, unnecessary defensive programming.

Of course.

Suzuki