Re: [PATCH 08/21] x86/intel_rdt/cqm: Add RMID(Resource monitoring ID) management

From: Thomas Gleixner
Date: Sun Jul 02 2017 - 06:06:55 EST


On Mon, 26 Jun 2017, Vikas Shivappa wrote:
> +static u64 __rmid_read(u32 rmid, u32 eventid)
> +{
> +	u64 val;
> +
> +	wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
> +	rdmsrl(MSR_IA32_QM_CTR, val);

The calling convention of this function needs to be documented. It's
obvious that it needs to be serialized, i.e. the EVTSEL write and the
CTR read must not be interleaved with another reader on the same CPU,
but 'obvious' is not documentation. A sketch of such a comment follows
the quoted function below.

> +
> +	/*
> +	 * Aside from the ERROR and UNAVAIL bits, the return value is the
> +	 * count for this @eventid tagged with @rmid.
> +	 */
> +	return val;
> +}
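
Something along these lines perhaps (a sketch, wording up to you):

/*
 * __rmid_read() - Read the event count of @rmid for @eventid
 *
 * Writes MSR_IA32_QM_EVTSEL and then reads MSR_IA32_QM_CTR. The two
 * MSR accesses must not be interleaved with another __rmid_read() on
 * the same CPU, so the caller has to guarantee serialization, e.g. by
 * calling this from an IPI handler or with interrupts disabled.
 */
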
> +
> +/*
> + * Test whether an RMID is dirty (occupancy > threshold_occupancy)
> + */
> +static void intel_cqm_stable(void *arg)
> +{
> +	struct rmid_entry *entry;
> +	u64 val;
> +
> +	/*
> +	 * Since we are in the IPI already lets mark all the RMIDs
> +	 * that are dirty

This comment is crap. It suggests: Let's do it while we are here anyway.

But that's not true. The IPI is issued solely to figure out which RMIDs are
dirty.
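
Something like this instead (a sketch):

	/*
	 * This is called via IPI on one CPU per package. The sole
	 * purpose of the IPI is to check which of the limbo RMIDs
	 * are above the threshold on this package and mark them
	 * dirty.
	 */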

> +	 */
> +	list_for_each_entry(entry, &rmid_limbo_lru, list) {

Since this is executed concurrently on multiple CPUs, this needs an
explanation of why the list is safe to iterate without explicit
protection here.
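
If the limbo list is in fact only modified with rdtgroup_mutex held,
which the issuer of the IPI holds across on_each_cpu_mask(), then a
comment along these lines (a sketch, assuming that's the actual
locking rule) would settle it:

	/*
	 * The limbo list is only modified with rdtgroup_mutex held,
	 * and the IPI issuer holds that mutex across the IPI. All
	 * CPUs merely read the list and write entry->state, so
	 * concurrent walkers cannot conflict.
	 */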

> +		val = __rmid_read(entry->rmid, QOS_L3_OCCUP_EVENT_ID);
> +		if (val > intel_cqm_threshold)
> +			entry->state = RMID_DIRTY;
> +	}
> +}
> +
> +/*
> + * Scan the limbo list and move all entries that are below the
> + * intel_cqm_threshold to the free list.
> + * Return "true" if the limbo list is empty, "false" if there are
> + * still some RMIDs there.
> + */
> +static bool try_freeing_limbo_rmid(void)
> +{
> +	struct rmid_entry *entry, *tmp;
> +	struct rdt_resource *r;
> +	cpumask_var_t cpu_mask;
> +	struct rdt_domain *d;
> +	bool ret = true;
> +
> +	if (list_empty(&rmid_limbo_lru))
> +		return ret;
> +
> +	if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
> +		return false;
> +
> +	r = &rdt_resources_all[RDT_RESOURCE_L3];
> +
> +	list_for_each_entry(d, &r->domains, list)
> +		cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
> +
> +	/*
> +	 * Test whether an RMID is free for each package.

That wants a bit of explanation somewhere of why RMIDs have global
scope. That's a pure implementation decision: from a hardware POV
RMIDs have package scope, so the same RMID could be used on different
packages for different purposes.
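
E.g. a comment at the top of the RMID allocator (a sketch):

/*
 * RMIDs are package scoped in hardware, but are managed with global
 * (system wide) scope by this implementation: an allocated RMID is
 * occupied on all packages at once. That's purely an implementation
 * choice to keep the allocator and the user visible semantics simple;
 * the hardware would allow reusing the same RMID on different
 * packages for different purposes.
 */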

> +	 */
> +	on_each_cpu_mask(cpu_mask, intel_cqm_stable, NULL, true);
> +
> +	list_for_each_entry_safe(entry, tmp, &rmid_limbo_lru, list) {
> +		/*
> +		 * Ignore the RMIDs that are marked dirty and reset the
> +		 * state to check for being dirty again later.

Ignore? -EMAKESNOSENSE. The dirty RMIDs are not ignored, they are kept
on the limbo list so the next scan checks them again.
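
Say what actually happens instead, e.g. (a sketch):

		/*
		 * An RMID which is dirty on any package stays on the
		 * limbo list. Reset the state so the next scan will
		 * check it again.
		 */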

> +		 */
> +		if (entry->state == RMID_DIRTY) {
> +			entry->state = RMID_CHECK;
> +			ret = false;
> +			continue;
> +		}
> +		list_del(&entry->list);
> +		list_add_tail(&entry->list, &rmid_free_lru);
> +	}
> +
> +	free_cpumask_var(cpu_mask);

...

> +void free_rmid(u32 rmid)
> +{
> +	struct rmid_entry *entry;
> +
> +	lockdep_assert_held(&rdtgroup_mutex);
> +
> +	WARN_ON(!rmid);
> +	entry = __rmid_entry(rmid);
> +
> +	entry->state = RMID_CHECK;
> +
> +	if (rdt_mon_features & (1 << QOS_L3_OCCUP_EVENT_ID))
> +		list_add_tail(&entry->list, &rmid_limbo_lru);
> +	else
> +		list_add_tail(&entry->list, &rmid_free_lru);

Thinking a bit more about the limbo mechanics:

In case an RMID was never used on a particular package, the state check
still forces an IPI on all packages unconditionally. That's suboptimal
at least.

We know on which packages a given RMID was used, so we could restrict
the checks to exactly those packages, but I'm not sure it's worth the
trouble. We might at least document that and explain why it's
implemented this way.
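
If it ever turns out to be worth the trouble, the tracking could look
like this (a sketch with made up names; MAX_PACKAGES is whatever upper
bound fits):

struct rmid_entry {
	u32			rmid;
	enum rmid_state		state;
	struct list_head	list;
	/* Packages on which this RMID was ever in use */
	unsigned long		busy_pkgs[BITS_TO_LONGS(MAX_PACKAGES)];
};

/* To be called when the RMID is assigned on a domain */
static void rmid_mark_used(struct rmid_entry *entry, int pkg_id)
{
	set_bit(pkg_id, entry->busy_pkgs);
}

try_freeing_limbo_rmid() would then set a bit in cpu_mask only for the
packages which have their bit set in busy_pkgs, instead of IPI'ing one
CPU in every domain.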

Thanks,

tglx