Re: [PATCH v2] x86/resctrl: Clear the stale staged config after the configuration is completed
From: Shawn Wang
Date: Fri Oct 21 2022 - 04:23:26 EST
Hi Reinette,
On 10/21/2022 12:35 AM, Reinette Chatre wrote:
...
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 1dafbdc5ac31..2c719da5544f 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -338,6 +338,8 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
 				msr_param.high = max(msr_param.high, idx + 1);
 			}
 		}
+		/* Clear the stale staged config */
+		memset(d->staged_config, 0, sizeof(d->staged_config));
 	}

 	if (cpumask_empty(cpu_mask))
Please also ensure that the temporary storage is cleared if there is an
early exit because of failure. Please do not duplicate the memset() code
but instead move it to a common exit location.
There are two different resctrl_arch_update_domains() function call paths:
1. rdtgroup_mkdir()->rdtgroup_mkdir_ctrl_mon()->rdtgroup_init_alloc()->resctrl_arch_update_domains()
2. rdtgroup_schemata_write()->resctrl_arch_update_domains()
Perhaps there is no common exit location if we want to clear staged_config[] after every call of resctrl_arch_update_domains().
I was referring to a common exit out of resctrl_arch_update_domains().
Look at how resctrl_arch_update_domains() behaves with this change:
resctrl_arch_update_domains()
{
	...
	if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
		return -ENOMEM;
	...
	list_for_each_entry(d, &r->domains, list) {
		...
		memset(d->staged_config, 0, sizeof(d->staged_config));
	}
	...
done:
	free_cpumask_var(cpu_mask);
	return 0;
}
The goal of this fix is to ensure that staged_config[] is cleared on
return from resctrl_arch_update_domains() so that there is no stale
data in staged_config[] when resctrl_arch_update_domains() is called
again.
Considering this, I can see two scenarios in the above solution where
staged_config[] is not cleared on exit from resctrl_arch_update_domains():
It may not be enough to just clear staged_config[] when
resctrl_arch_update_domains() exits. I think the fix needs to make sure
staged_config[] can be cleared where it is set.
The modification of staged_config[] comes from two paths:
Path 1:
rdtgroup_schemata_write() {
	...
	rdtgroup_parse_resource()	// set staged_config[]
	...
	resctrl_arch_update_domains()	// clear staged_config[]
	...
}
Path 2:
rdtgroup_init_alloc() {
	...
	rdtgroup_init_mba()/rdtgroup_init_cat()	// set staged_config[]
	...
	resctrl_arch_update_domains()	// clear staged_config[]
	...
}
If we clear staged_config[] only inside resctrl_arch_update_domains(), any
error path that takes a goto between setting staged_config[] and calling
resctrl_arch_update_domains() skips the clearing. Stale data can then
still be left in staged_config[].
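To illustrate the concern, here is a minimal user-space sketch, not kernel
code. struct staged, struct domain, parse_resource(), update_domains() and
schemata_write() are hypothetical stand-ins for the real resctrl structures
and functions, and the "stage a value, then fail" behaviour is an assumption
that models a parse error arriving after an earlier value was already staged.

```c
#include <string.h>

/* Hypothetical stand-ins; these are not the real kernel definitions. */
#define NUM_CONF 2

struct staged {
	int new_ctrl;
	int have_new_ctrl;
};

struct domain {
	struct staged staged_config[NUM_CONF];
};

/* Stand-in for rdtgroup_parse_resource(): stages a value, may then fail. */
static int parse_resource(struct domain *d, int val, int fail)
{
	d->staged_config[0].new_ctrl = val;
	d->staged_config[0].have_new_ctrl = 1;
	return fail ? -1 : 0;
}

/* Stand-in for resctrl_arch_update_domains() with the clearing inside it. */
static int update_domains(struct domain *d)
{
	/* ... staged values would be applied to hardware here ... */
	memset(d->staged_config, 0, sizeof(d->staged_config));
	return 0;
}

/* Stand-in for rdtgroup_schemata_write(). */
static int schemata_write(struct domain *d, int val, int fail)
{
	int ret;

	ret = parse_resource(d, val, fail);
	if (ret)
		goto out;	/* skips update_domains(), so nothing is cleared */

	ret = update_domains(d);
out:
	return ret;
}
```

With this layout, a failing schemata_write() returns with have_new_ctrl
still set, and the next caller starts from stale staged data.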
I think it may be better to do the clearing at the points where
rdtgroup_schemata_write() and rdtgroup_init_alloc() exit.
(Sorry, I mistakenly wrote rdtgroup_init_alloc() as
rdtgroup_mkdir_ctrl_mon() in my last reply.)
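A rough user-space sketch of the caller-side clearing could look like the
following. It is only an illustration under assumptions, not a proposed
patch: clear_staged_config() is a hypothetical helper, and struct staged,
struct domain, parse_resource(), update_domains() and schemata_write() are
simplified stand-ins for the real resctrl code.

```c
#include <string.h>

/* Hypothetical stand-ins; these are not the real kernel definitions. */
#define NUM_CONF 2

struct staged {
	int new_ctrl;
	int have_new_ctrl;
};

struct domain {
	struct staged staged_config[NUM_CONF];
};

/* Hypothetical helper: drop whatever is staged for this domain. */
static void clear_staged_config(struct domain *d)
{
	memset(d->staged_config, 0, sizeof(d->staged_config));
}

/* Stand-in for rdtgroup_parse_resource(): stages a value, may then fail. */
static int parse_resource(struct domain *d, int val, int fail)
{
	d->staged_config[0].new_ctrl = val;
	d->staged_config[0].have_new_ctrl = 1;
	return fail ? -1 : 0;
}

/* Stand-in for resctrl_arch_update_domains(); it no longer clears. */
static int update_domains(struct domain *d)
{
	(void)d;	/* ... staged values would be applied here ... */
	return 0;
}

/* Stand-in for rdtgroup_schemata_write() with one common exit. */
static int schemata_write(struct domain *d, int val, int fail)
{
	int ret;

	ret = parse_resource(d, val, fail);
	if (ret)
		goto out;

	ret = update_domains(d);
out:
	/* Both the success path and the error path pass through here. */
	clear_staged_config(d);
	return ret;
}
```

Because the clearing sits at the caller's single exit label, it runs whether
or not update_domains() was reached.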
Thank you,
Shawn