Re: [PATCH v4 1/2] x86/resctrl: Update task closid/rmid with task_call_func()

From: Reinette Chatre
Date: Tue Dec 13 2022 - 13:35:00 EST


Hi Peter,

On 12/12/2022 9:36 AM, Peter Newman wrote:
> On Sat, Dec 10, 2022 at 12:54 AM Reinette Chatre <reinette.chatre@xxxxxxxxx> wrote:
>> On 12/8/2022 2:30 PM, Peter Newman wrote:
>>> Based on this, I'll just sketch out the first scenario below and drop
>>> (2) from the changelog. This also implies that the group update cases
>>
>> ok, thank you for doing that analysis.
>>
>>> can use a single smp_mb() to provide all the necessary ordering, because
>>> there's a full barrier on context switch for it to pair with, so I don't
>>> need to broadcast IPI anymore.  I don't know whether task_call_func() is
>>
>> This is not clear to me because rdt_move_group_tasks() seems to have the
>> same code as shown below as vulnerable to re-ordering. Only difference
>> is that it uses the "//false" checks to set a bit in the cpumask for a
>> later IPI instead of an immediate IPI.
>
> An smp_mb() between writing the new task_struct::{closid,rmid} and
> calling task_curr() would prevent the reordering I described, but I
> was worried about the cost of executing a full barrier for every
> matching task.

So for moving a single task the solution may just be to change
the current barrier() to smp_mb()?

>
> I tried something like this:
>
> for_each_process_thread(p, t) {
> if (!from || is_closid_match(t, from) ||
> is_rmid_match(t, from)) {
> WRITE_ONCE(t->closid, to->closid);
> WRITE_ONCE(t->rmid, to->mon.rmid);
> /* group moves are serialized by rdt */
> t->resctrl_dirty = true;
> }
> }
> if (IS_ENABLED(CONFIG_SMP) && mask) {
> /* Order t->{closid,rmid} stores before loads in task_curr() */
> smp_mb();
> for_each_process_thread(p, t) {
> if (t->resctrl_dirty) {
> if (task_curr(t))
> cpumask_set_cpu(task_cpu(t), mask);
> t->resctrl_dirty = false;
> }
> }
> }
>

struct task_struct would not welcome a new member dedicated to resctrl's
rare use for convenience. Another option may be to use a flag within
the variables themselves but that seems like significant complication
(flag need to be dealt with during scheduling) for which the benefit
is not clear to me. I would prefer that we start with the simplest
solution first (I see that as IPI to all CPUs). The changelog needs clear
description of the problem needing to be solved and the solution chosen, noting
the tradeoffs with other possible solutions. You can submit that, as an RFC
if the "IPI to all CPUs" remains a concern, after which we can bring that
submission to the attention of the experts who would have needed information then
to point us in the right direction.

Reinette