Re: [PATCH 5/7] x86/intel_rdt: Software Cache for IA32_PQR_MSR

From: Thomas Gleixner
Date: Fri May 15 2015 - 16:15:26 EST


On Mon, 11 May 2015, Vikas Shivappa wrote:

> Signed-off-by: Vikas Shivappa <vikas.shivappa@xxxxxxxxxxxxxxx>
>
> Conflicts:
> arch/x86/kernel/cpu/perf_event_intel_cqm.c

And that's interesting for what?

> --- /dev/null
> +++ b/arch/x86/include/asm/rdt_common.h

> @@ -0,0 +1,13 @@
> +#ifndef _X86_RDT_H_
> +#define _X86_RDT_H_
> +
> +#define MSR_IA32_PQR_ASSOC 0x0c8f
> +
> +struct intel_pqr_state {
> + raw_spinlock_t lock;
> + int rmid;
> + int clos;

Those want to be u32. We have no type checks there, but u32 is the
proper data type for wrmsr. Untested sketch below.

> + int cnt;
> +};
> +

What's wrong with having this in intel_rdt.h? It seems you're a big
fan of sprinkling stuff all over the place so reading becomes a
chasing game.
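
IOW, something like this (untested, same fields as the quoted hunk,
only the types changed):

	struct intel_pqr_state {
		raw_spinlock_t	lock;
		u32		rmid;
		u32		clos;
		int		cnt;
	};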

> {
> struct task_struct *task = current;
> struct intel_rdt *ir;
> + struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
> + unsigned long flags;
>
> + raw_spin_lock_irqsave(&state->lock, flags);

finish_arch_switch() is called with interrupts disabled already ...

> rcu_read_lock();

So we now have a spin_lock() and rcu_read_lock() and no explanation
what is protecting what.

> ir = task_rdt(task);
> - if (ir->clos == clos) {
> + if (ir->clos == state->clos) {

And of course taking the spin lock for checking state->clos is
complete and utter nonsense.

state->clos can only be changed by this code and the only reason why
we need the lock is to protect against concurrent modification of
state->rmid.

So the check for ir->clos == state->clos can be done lockless.

And I seriously doubt, that state->lock is necessary at all.

Q: What is it protecting?
A: state->clos, state->rmid, state->cnt

Q: What's the context?
A: Per cpu context. The per cpu pqr state is NEVER modified from a
remote cpu.

Q: What is the lock for?
A: Nothing.

Q: Why?
A: Because interrupt disabled regions protect per cpu state perfectly
fine and there is no memory ordering issue which would require a
lock or barrier either.

Peter explained it to you several times already that context switch is
one of the most sensitive hot paths where we care about every cycle. But
you just go ahead and mindlessly create pointless overhead.
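
Untested, and the function name is made up because the hunk does not
show it, but the whole thing boils down to:

	static inline void intel_rdt_sched_in(void)
	{
		struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
		struct intel_rdt *ir;

		/*
		 * Called from finish_arch_switch() with interrupts
		 * disabled. The per cpu pqr state is never touched from
		 * a remote cpu, so no lock and no irqsave dance needed.
		 */
		rcu_read_lock();
		ir = task_rdt(current);
		if (ir->clos != state->clos) {
			wrmsr(MSR_IA32_PQR_ASSOC, state->rmid, ir->clos);
			state->clos = ir->clos;
		}
		rcu_read_unlock();
	}

No lock, no irqsave, and the common case (same closid) is a pure per
cpu read.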

> + /*
> + * PQR has closid in high 32 bits and CQM-RMID
> + * in low 10 bits. Rewrite the exsting rmid from
> + * software cache.

This comment wouldn't be necessary if you had properly documented
struct pqr_state. See the sketch below the quoted hunk.

> + */
> + wrmsr(MSR_IA32_PQR_ASSOC, state->rmid, ir->clos);
> + state->clos = ir->clos;
> rcu_read_unlock();
> + raw_spin_unlock_irqrestore(&state->lock, flags);
> +
> }
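
Something like this would make the inline comment superfluous (the
member descriptions are my reading of the code, so double check):

	/**
	 * struct intel_pqr_state - Cache for the per CPU IA32_PQR_ASSOC MSR
	 * @lock:	Serializes rmid updates against the CQM event code
	 * @rmid:	Cached RMID, occupies the low 10 bits of the MSR
	 * @clos:	Cached CLOSID, occupies the high 32 bits of the MSR
	 * @cnt:	RMID reference count from the CQM code
	 *
	 * Both halves of the MSR are written in one go, so the cached
	 * values avoid redundant MSR writes in the context switch path.
	 */
	struct intel_pqr_state {
		raw_spinlock_t	lock;
		u32		rmid;
		u32		clos;
		int		cnt;
	};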

> -static DEFINE_PER_CPU(struct intel_cqm_state, cqm_state);
> +DEFINE_PER_CPU(struct intel_pqr_state, pqr_state);

With CONFIG_PERF=n and CONFIG_CGROUP_RDT=y the linker will fail.
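
One way out is to move the DEFINE_PER_CPU out of the perf-only file
into something which is built for both configurations, e.g. (file name
is a suggestion, not part of the patch):

	/* arch/x86/kernel/cpu/intel_rdt_common.c */
	#include <linux/percpu.h>
	#include <asm/rdt_common.h>

	DEFINE_PER_CPU(struct intel_pqr_state, pqr_state);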

Thanks,

tglx