Re: [PATCH 3/7] x86/intel_rdt: Add support for cache bit mask management
From: Thomas Gleixner
Date: Fri May 15 2015 - 15:25:38 EST
On Mon, 11 May 2015, Vikas Shivappa wrote:
> @@ -35,6 +36,42 @@ static struct rdt_subsys_info rdtss_info;
> static DEFINE_MUTEX(rdt_group_mutex);
> struct intel_rdt rdt_root_group;
>
> +/*
> + * Mask of CPUs for writing CBM values. We only need one per-socket.
One mask? One CPU? One what? Please add comments which are
understandable and useful for people who did NOT write that code.
> + */
> +static cpumask_t rdt_cpumask;
> +static void __cpu_cbm_update(void *info)
> +{
> + unsigned int closid = *((unsigned int *)info);
So all you want is to transfer closid, right? Why having a pointer
instead of just doing
on_each_cpu_mask(...., (void *) id);
and
id = (unsigned long) id;
Looks too simple and readable, right?
> +/*
> + * cbm_update_msrs() - Updates all the existing IA32_L3_MASK_n MSRs
> + * which are one per CLOSid, except IA32_L3_MASK_0 on the current package.
> + * @cpu : the cpu on which the mask is updated.
> + */
> +static inline void cbm_update_msrs(int cpu)
> +{
> + int maxid = boot_cpu_data.x86_rdt_max_closid;
> + unsigned int i;
> +
> + if (WARN_ON(cpu != smp_processor_id()))
> + return;
> +
> + for (i = 1; i < maxid; i++) {
Lacks a comment why this starts with 1 and not with 0 as one would expect.
> + if (ccmap[i].clos_refcnt)
> + __cpu_cbm_update(&i);
> + }
> +}
This is only used by the cpu hotplug code. So why does it not live
close to it? Because its more fun to search for functions than having
them just in place?
> +/*
> + * intel_cache_alloc_cbm_write() - Validates and writes the
> + * cache bit mask(cbm) to the IA32_L3_MASK_n
> + * and also store the same in the ccmap.
> + *
> + * CLOSids are reused for cgroups which have same bitmask.
> + * - This helps to use the scant CLOSids optimally.
> + * - This also implies that at context switch write
> + * to PQR-MSR is done only when a task with a
> + * different bitmask is scheduled in.
> + */
> +static int intel_cache_alloc_cbm_write(struct cgroup_subsys_state *css,
Can you please drop these intel_ prefixes? They provide no value and
just eat space.
> + struct cftype *cft, u64 cbmvalue)
> +{
> + u32 max_cbm = boot_cpu_data.x86_rdt_max_cbm_len;
> + struct intel_rdt *ir = css_rdt(css);
> + unsigned long cache_mask, max_mask;
> + unsigned long *cbm_tmp;
> + unsigned int closid;
> + ssize_t err = 0;
> +
> + if (ir == &rdt_root_group)
> + return -EPERM;
> + bitmap_set(&max_mask, 0, max_cbm);
> +
> + /*
> + * Need global mutex as cbm write may allocate a closid.
> + */
> + mutex_lock(&rdt_group_mutex);
> + bitmap_and(&cache_mask, (unsigned long *)&cbmvalue, &max_mask, max_cbm);
> + cbm_tmp = &ccmap[ir->clos].cache_mask;
> +
> + if (bitmap_equal(&cache_mask, cbm_tmp, MAX_CBM_LENGTH))
> + goto out;
> +
Sigh, what's the point of this bitmap shuffling? Nothing as far as I
can tell, because you first set all bits of maxmask and then AND
cbmvalue.
All you achieve is, that you truncate cbmvalue to max_cbm bits. So if
cbmvalue has any of the invalid bits set, you happily ignore that.
This whole bitmap usage is just making the code obfuscated.
u64 max_mask = (1ULL << max_cbm) - 1;
if (cbm_value & ~max_mask)
return -EINVAL;
if (cbm_value == (u64)ccmap[ir->clos].cache_mask)
return 0;
No bitmap_set, bitmap_and nothing. Simple and readable, right?
The bitmap would only be necessary, IF max_cbm is > 64. But that's not
the case and nothing in your code would survive ma_ cbm > 64. And as I
said before CBM_LEN is 32bit for now and I serioulsy doubt, that it
can ever go over 64bit simply because that would not fit into the mask
MSRs.
So can you please get rid of that bitmap nonsense including the
dynamic allocation and use simple and readable code constructs?
Just for the record:
struct bla {
u64 cbm_mask;
};
versus:
struct bla {
unsigned long *cbm_mask;
};
plus
bla.cbm_mask = kzalloc(sizeof(u64));
Is just utter waste of data and text space.
> + /*
> + * At this point we are sure to change the cache_mask.Hence release the
> + * reference to the current CLOSid and try to get a reference for
> + * a different CLOSid.
> + */
> + __clos_put(ir->clos);
So you drop the reference, before you make sure that you can get a new one?
> + if (cbm_search(cache_mask, &closid)) {
> + ir->clos = closid;
> + __clos_get(closid);
> + } else {
> + err = intel_rdt_alloc_closid(ir);
> + if (err)
> + goto out;
What happens if intel_rdt_alloc_closid() fails? ir->clos is unchanged,
but you have no reference anymore. The next call to this function or
anything else which calls __clos_put() will trigger the underflow
warning and the warning will be completely useless because the
wreckage happened before that.
> +static inline bool intel_rdt_update_cpumask(int cpu)
> +{
> + int phys_id = topology_physical_package_id(cpu);
> + struct cpumask *mask = &rdt_cpumask;
> + int i;
> +
> + for_each_cpu(i, mask) {
> + if (phys_id == topology_physical_package_id(i))
> + return false;
> + }
You must be kidding. You know how many packages are possible. What's
wrong with having a bitmap of physical packages which have a CPU
handling it?
id = topology_physical_package_id(cpu);
if (!__test_and_set_bit(id, &package_map)) {
cpumask_set_cpu(cpu, &rdt_cpumask);
cbm_update_msrs(cpu);
}
Four lines of code without loops and hoops. And here is a bitmap the
proper data structure.
> +static void intel_rdt_cpu_exit(unsigned int cpu)
> +{
> + int phys_id = topology_physical_package_id(cpu);
> + int i;
> +
> + mutex_lock(&rdt_group_mutex);
> + if (!cpumask_test_and_clear_cpu(cpu, &rdt_cpumask)) {
> + mutex_unlock(&rdt_group_mutex);
> + return;
> + }
> +
> + for_each_online_cpu(i) {
> + if (i == cpu)
> + continue;
> +
> + if (phys_id == topology_physical_package_id(i)) {
> + cpumask_set_cpu(i, &rdt_cpumask);
> + break;
> + }
> + }
There is a better way than iterating over all online cpus ....
>
> +static struct cftype rdt_files[] = {
> + {
> + .name = "cache_mask",
> + .seq_show = intel_cache_alloc_cbm_read,
> + .write_u64 = intel_cache_alloc_cbm_write,
> + .mode = 0666,
> + },
> + { } /* terminate */
> +};
Where is the Documentation for that file? What's the content, format,
error codes ....
Sigh.
tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/