Re: [PATCH 2/3] x86/resctrl: Update PQR_ASSOC MSR synchronously when moving task to resource group

From: Valentin Schneider
Date: Fri Dec 11 2020 - 15:47:44 EST



On 03/12/20 23:25, Reinette Chatre wrote:
> Fixes: e02737d5b826 ("x86/intel_rdt: Add tasks files")
> Reported-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
> Reported-by: Valentin Schneider <valentin.schneider@xxxxxxx>
> Signed-off-by: Fenghua Yu <fenghua.yu@xxxxxxxxx>
> Signed-off-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
> Reviewed-by: Tony Luck <tony.luck@xxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx

Some pedantic comments below; with James' task_curr() + task_cpu()
suggestion applied:

Reviewed-by: Valentin Schneider <valentin.schneider@xxxxxxx>
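
For reference, here is roughly what I picture the helper looking like with
that suggestion folded in. This is only a sketch: I'm reusing the
update_task_closid_rmid() name from the hunk below and assuming a
_update_task_closid_rmid() IPI callback plus resctrl_sched_in() doing the
actual PQR_ASSOC write, neither of which is visible in the quoted context.

  static void _update_task_closid_rmid(void *task)
  {
          /*
           * If the task is still current on this CPU, the new closid/rmid
           * are picked up now; otherwise the MSR is written when the task
           * is next scheduled in.
           */
          if (task == current)
                  resctrl_sched_in();
  }

  static void update_task_closid_rmid(struct task_struct *t)
  {
          /*
           * task_curr() + task_cpu(): only interrupt the task if it is
           * actually running, and target the CPU it is running on.
           */
          if (IS_ENABLED(CONFIG_SMP) && task_curr(t))
                  smp_call_function_single(task_cpu(t),
                                           _update_task_closid_rmid, t, 1);
          else
                  _update_task_closid_rmid(t);
  }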

> ---
>  static int __rdtgroup_move_task(struct task_struct *tsk,
>                                  struct rdtgroup *rdtgrp)
>  {
> -        struct task_move_callback *callback;
> -        int ret;
> -
> -        callback = kzalloc(sizeof(*callback), GFP_KERNEL);
> -        if (!callback)
> -                return -ENOMEM;
> -        callback->work.func = move_myself;
> -        callback->rdtgrp = rdtgrp;
> -
>          /*
> -         * Take a refcount, so rdtgrp cannot be freed before the
> -         * callback has been invoked.
> +         * Set the task's closid/rmid before the PQR_ASSOC MSR can be
> +         * updated by them.
> +         *
> +         * For ctrl_mon groups, move both closid and rmid.
> +         * For monitor groups, can move the tasks only from
> +         * their parent CTRL group.
>           */
> -        atomic_inc(&rdtgrp->waitcount);
> -        ret = task_work_add(tsk, &callback->work, TWA_RESUME);
> -        if (ret) {
> -                /*
> -                 * Task is exiting. Drop the refcount and free the callback.
> -                 * No need to check the refcount as the group cannot be
> -                 * deleted before the write function unlocks rdtgroup_mutex.
> -                 */
> -                atomic_dec(&rdtgrp->waitcount);
> -                kfree(callback);
> -                rdt_last_cmd_puts("Task exited\n");
> -        } else {
> -                /*
> -                 * For ctrl_mon groups move both closid and rmid.
> -                 * For monitor groups, can move the tasks only from
> -                 * their parent CTRL group.
> -                 */
> -                if (rdtgrp->type == RDTCTRL_GROUP) {
> -                        tsk->closid = rdtgrp->closid;
> +
> +        if (rdtgrp->type == RDTCTRL_GROUP) {
> +                tsk->closid = rdtgrp->closid;
> +                tsk->rmid = rdtgrp->mon.rmid;
> +        } else if (rdtgrp->type == RDTMON_GROUP) {
> +                if (rdtgrp->mon.parent->closid == tsk->closid) {
>                          tsk->rmid = rdtgrp->mon.rmid;
> -                } else if (rdtgrp->type == RDTMON_GROUP) {
> -                        if (rdtgrp->mon.parent->closid == tsk->closid) {
> -                                tsk->rmid = rdtgrp->mon.rmid;
> -                        } else {
> -                                rdt_last_cmd_puts("Can't move task to different control group\n");
> -                                ret = -EINVAL;
> -                        }
> +                } else {
> +                        rdt_last_cmd_puts("Can't move task to different control group\n");
> +                        return -EINVAL;
>                  }
> +        } else {
> +                rdt_last_cmd_puts("Invalid resource group type\n");
> +                return -EINVAL;
>          }

James already pointed out this should be a WARN_ON_ONCE(), but is that the
right place to assert rdtgrp->type validity?

I see only a single assignment to rdtgrp->type in mkdir_rdt_prepare();
could we fail the group creation there instead if the passed rtype is
invalid?
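
i.e. something like the below somewhere early in mkdir_rdt_prepare() (a
sketch only; rtype here is the enum rdt_group_type parameter that function
already takes, and the exact placement and error reporting are hand-waved):

          /*
           * Sketch: reject unknown group types when the group is created,
           * instead of when a task is later moved into it.
           */
          if (rtype != RDTCTRL_GROUP && rtype != RDTMON_GROUP)
                  return -EINVAL;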

> -        return ret;
> +
> +        /*
> +         * By now, the task's closid and rmid are set. If the task is current
> +         * on a CPU, the PQR_ASSOC MSR needs to be updated to make the resource
> +         * group go into effect. If the task is not current, the MSR will be
> +         * updated when the task is scheduled in.
> +         */
> +        update_task_closid_rmid(tsk);

We need the above writes to be compile-ordered before the IPI is sent.
There *is* a preempt_disable() down in smp_call_function_single() that
gives us the required barrier(); can we deem that sufficient, or would we
want an explicit one before update_task_closid_rmid() for the sake of
clarity?
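
i.e., if we do want to spell it out (purely as documentation, since
smp_call_function_single() already gives us the barrier()), something like:

          /*
           * Make sure the closid/rmid stores above are compile-ordered
           * before the IPI below. smp_call_function_single() implies this
           * via preempt_disable(), so the barrier() is documentation only.
           */
          barrier();
          update_task_closid_rmid(tsk);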

> +
> +        return 0;
>  }
> 
>  static bool is_closid_match(struct task_struct *t, struct rdtgroup *r)