On 14/12/20 18:38, Reinette Chatre wrote:
Thinking a bit more (too much?) about it, we could limit ourselves to
wrapping only reads not protected by the rdtgroup_mutex: the only two
task_struct {closid, rmid} writers are
- rdtgroup_move_task()
- rdt_move_group_tasks()
and they are both invoked while holding said mutex. Thus, a reader holding
the mutex cannot race with a write, so load tearing ought to be safe.
The reads that are not protected by the rdtgroup_mutex can be found in
__resctrl_sched_in(). It thus sounds to me like your proposed changes to
this function found in your patch [1] is what is needed?
Right.
It is not clear
to me how the pairing would work in this case though. If I understand
correctly the goal is for the write to the closid/rmid in the functions
you mention above to be paired with the reads in resctrl_sched_in() and
it is not clear how adding a single READ_ONCE would accomplish this
pairing by itself.
So all the writes would need WRITE_ONCE(), but not all reads would require
a READ_ONCE() (those that can't race with writes shouldn't need them).
I'll go and update that patch so that you can bundle it with v2 of this
series.
It is also not entirely clear to me what the problematic scenario could
be. If I understand correctly, the risk is (as you explained in your
commit message), that a CPU could have its {closid, rmid} fields read
locally (resctrl_sched_in()) while they are concurrently being written
to from another CPU (in rdtgroup_move_task() and rdt_move_group_tasks()
as you state above). If this happens then a task being moved may be
scheduled in with its old closid/rmid.
Worse, it may be scheduled with a mangled closid/rmid if the read in
resctrl_sched_in() is torn (i.e. compiled as a sequence of multiple
smaller-sized loads). This one of the things READ_ONCE() / WRITE_ONCE()
try to address.
The update of closid/rmid in
rdtgroup_move_task()/rdt_move_group_tasks() is followed by
smp_call_function_xx() where the registers are updated with preemption
disabled and thus protected against __switch_to. If a task was thus
incorrectly scheduled in with old closid/rmid, would it not be corrected
at this point?
Excluding load/store tearing, then yes, the above works fine.