Re: [PATCH 2/2] x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount

From: Thomas Gleixner
Date: Sat Nov 26 2016 - 04:11:52 EST

Next message: Vitaly Wool: "Re: [PATCH 2/3] z3fold: don't fail kernel build if z3fold_header is too big"
Previous message: Vitaly Wool: "Re: [PATH] z3fold: extend compaction function"
In reply to: Fenghua Yu: "Re: [PATCH 2/2] x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount"
Next in thread: Fenghua Yu: "Re: [PATCH 2/2] x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Fri, 25 Nov 2016, Fenghua Yu wrote:
> On Wed, Nov 23, 2016 at 03:23:50PM +0100, Thomas Gleixner wrote:
> > +#ifdef CONFIG_SMP
> > + /*
> > + * This is safe on x86 w/o barriers as the ordering
> > + * of writing to task_cpu() and t->on_cpu is
> > + * reverse to the reading here. The detection is
> > + * inaccurate as tasks might move or schedule
> > + * before the smp function call takes place. In
> > + * such a case the function call is pointless, but
> > + * there is no other side effect.
> > + */
>
> If process p1 is running on CPU1 before this point,
>
> > + if (mask && t->on_cpu)
> > + cpumask_set_cpu(task_cpu(t), mask);
>
> If between CPU1 is set in mask and rdt_update_closid(tmpmask, NULL) is
> called, p1 is switched to CPU2, and process p2 with its own closid
> (e.g. 2) is switched to CPU1.
>
> Then closid in PQR_ASSOC is set incorrectly as 0 instead of 2 on CPU1.
> 0 may stay in PQR_ASSOC until next context switch which may take long time
> in cases of real time or HPC.
>
> Don't we need to care this situation? In this situation, the function call
> is not "pointless" but it's wrong, right?

No.

CPU0 CPU1 CPU2
T1 (closid 0) T2 (closid 2)

(t1->on_cpu)
set(1, mask)
preemption
T1 ->CPU2
switch_to(T3) preemption
switch_to(idle)
T2 -> CPU1
switch_to(T2) switch_to(T1)
intel_rdt_sched_in() intel_rdt_sched_in()
closid = T2->closid closid = T1->closid
closid =2 closid = CPU2->closid

rdt_update_closid(mask)

rdt_update_cpu_closid()
intel_rdt_sched_in()
closid = T2->closid
closid = 2

IOW, whatever comes first, sched_switch() or function call will update the
closid to the correct value.

If CPU2 was in the removed group then this looks the following way:

CPU0 CPU1 CPU2
T1 (closid 0) T2 (closid 2)

(t1->on_cpu)
set(1, mask)
preemption
T1 ->CPU2
switch_to(T3) preemption
switch_to(idle)
T2 -> CPU1
switch_to(T2) switch_to(T1)
intel_rdt_sched_in() intel_rdt_sched_in()
closid = T2->closid closid = T1->closid (0)
closid =2 closid = CPU2->closid
closid = 5
for_each_cpu(grp->mask)
CPU2->closid = 0

rdt_update_closid(mask)

rdt_update_cpu_closid() rdt_update_cpu_closid()
intel_rdt_sched_in( intel_rdt_sched_in()
closid = T2->closid closid = T1->closid (0)
closid = 2 closid = CPU2->closid
closid = 0

But on CPU2 the function call might be pointless as well in the following
situation:

CPU0 CPU1 CPU2
T1 (closid 0) T2 (closid 2)

(t1->on_cpu)
set(1, mask)
preemption
T1 ->CPU2
switch_to(T3) preemption
switch_to(idle)

for_each_cpu(grp->mask)
CPU2->closid = 0
T2 -> CPU1
switch_to(T2) switch_to(T1)
intel_rdt_sched_in() intel_rdt_sched_in()
closid = T2->closid closid = T1->closid (0)
closid =2 closid = CPU2->closid
closid = 0

rdt_update_closid(mask)

rdt_update_cpu_closid() rdt_update_cpu_closid()
intel_rdt_sched_in( intel_rdt_sched_in()
closid = T2->closid closid = T1->closid (0)
closid = 2 closid = CPU2->closid
closid = 0

The whole thing works by ordering:

1) Update closids of each task in the group and if a task is running on a
cpu then mark the CPU on which the task is running for the function
call.

2) Update closids of each CPU in the group

3) Or the cpumasks of the tasks and the groups and invoke the function call
on all of them

If an affected task does a sched_switch after task->closid is updated and
before the function call is invoked then the function call is pointless.

If a sched switch happens on a CPU after cpu->closid is updated and before
the function call is invoked then the function call is pointless.

But there is no case where the function call can result in a wrong value.

Thanks,

tglx

Next message: Vitaly Wool: "Re: [PATCH 2/3] z3fold: don't fail kernel build if z3fold_header is too big"
Previous message: Vitaly Wool: "Re: [PATH] z3fold: extend compaction function"
In reply to: Fenghua Yu: "Re: [PATCH 2/2] x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount"
Next in thread: Fenghua Yu: "Re: [PATCH 2/2] x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]