Re: [PATCH 0/3] x86/resctrl: Fix a few issues in moving a task to a resource group

From: Reinette Chatre
Date: Wed Dec 16 2020 - 13:27:41 EST

Next message: Reinette Chatre: "Re: [PATCH 2/3] x86/resctrl: Update PQR_ASSOC MSR synchronously when moving task to resource group"
Previous message: Matthew Wilcox (Oracle): "[PATCH 16/25] mm: Add folio_mapping"
In reply to: Valentin Schneider: "Re: [PATCH 0/3] x86/resctrl: Fix a few issues in moving a task to a resource group"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi Valentin,

On 12/16/2020 9:41 AM, Valentin Schneider wrote:

On 14/12/20 18:38, Reinette Chatre wrote:

Thinking a bit more (too much?) about it, we could limit ourselves to
wrapping only reads not protected by the rdtgroup_mutex: the only two
task_struct {closid, rmid} writers are
- rdtgroup_move_task()
- rdt_move_group_tasks()
and they are both invoked while holding said mutex. Thus, a reader holding
the mutex cannot race with a write, so load tearing ought to be safe.

The reads that are not protected by the rdtgroup_mutex can be found in
__resctrl_sched_in(). It thus sounds to me like your proposed changes to
this function found in your patch [1] is what is needed?

Right.

It is not clear
to me how the pairing would work in this case though. If I understand
correctly the goal is for the write to the closid/rmid in the functions
you mention above to be paired with the reads in resctrl_sched_in() and
it is not clear how adding a single READ_ONCE would accomplish this
pairing by itself.

So all the writes would need WRITE_ONCE(), but not all reads would require
a READ_ONCE() (those that can't race with writes shouldn't need them).

Got it. I misunderstood your previous response, mistakenly thinking that it stated that there would be only READ_ONCE() reads without being paired with WRITE_ONCE() writes. Thanks for clearing that up.

I'll go and update that patch so that you can bundle it with v2 of this
series.

Thank you so much.

It is also not entirely clear to me what the problematic scenario could
be. If I understand correctly, the risk is (as you explained in your
commit message), that a CPU could have its {closid, rmid} fields read
locally (resctrl_sched_in()) while they are concurrently being written
to from another CPU (in rdtgroup_move_task() and rdt_move_group_tasks()
as you state above). If this happens then a task being moved may be
scheduled in with its old closid/rmid.

Worse, it may be scheduled with a mangled closid/rmid if the read in
resctrl_sched_in() is torn (i.e. compiled as a sequence of multiple
smaller-sized loads). This one of the things READ_ONCE() / WRITE_ONCE()
try to address.

I see. This area is unfamiliar to me, thank you very much for catching this and helping to get it right.

The update of closid/rmid in
rdtgroup_move_task()/rdt_move_group_tasks() is followed by
smp_call_function_xx() where the registers are updated with preemption
disabled and thus protected against __switch_to. If a task was thus
incorrectly scheduled in with old closid/rmid, would it not be corrected
at this point?

Excluding load/store tearing, then yes, the above works fine.

Thank you for this sanity check.

Reinette

Next message: Reinette Chatre: "Re: [PATCH 2/3] x86/resctrl: Update PQR_ASSOC MSR synchronously when moving task to resource group"
Previous message: Matthew Wilcox (Oracle): "[PATCH 16/25] mm: Add folio_mapping"
In reply to: Valentin Schneider: "Re: [PATCH 0/3] x86/resctrl: Fix a few issues in moving a task to a resource group"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]