Re: [PATCH v2] x86/resctrl: Clear the stale staged config after the configuration is completed

From: Shawn Wang
Date: Wed Oct 26 2022 - 07:04:10 EST


Hi Reinette,

On 10/26/2022 3:34 AM, Reinette Chatre wrote:
Hi Shawn,

On 10/25/2022 8:30 AM, Shawn Wang wrote:
Hi Reinette,

On 10/25/2022 12:45 AM, Reinette Chatre wrote:
Hi Shawn,

On 10/23/2022 7:31 PM, Shawn Wang wrote:
On 10/22/2022 2:05 AM, Reinette Chatre wrote:

...

It may not be enough to just clear staged_config[] when
resctrl_arch_update_domains() exits. I think the fix needs to make
sure staged_config[] can be cleared where it is set.

The modification of staged_config[] comes from two paths:

Path 1:
rdtgroup_schemata_write() {
      ...
      rdtgroup_parse_resource()     // set staged_config[]
      ...
      resctrl_arch_update_domains()     // clear staged_config[]
      ...
}

Path 2:
rdtgroup_init_alloc() {
      ...
      rdtgroup_init_mba()/rdtgroup_init_cat()    // set staged_config[]
      ...
      resctrl_arch_update_domains()        // clear staged_config[]
      ...
}

If we clear staged_config[] in resctrl_arch_update_domains(), goto
statement for error handling between setting staged_config[] and
calling resctrl_arch_update_domains() will be ignored. This can still
remain the stale staged_config[].
ah - indeed. Thank you for catching that.


I think maybe it is better to put the clearing work where
rdtgroup_schemata_write() and rdtgroup_init_alloc() exit.


It may be more robust to let rdtgroup_init_alloc() follow
how rdtgroup_schemata_write() already ensures that it is
working with a clean state by clearing staged_config[] before
placing its staged config within.


I want to make sure, do you mean just ignore the stale value and
place the clearing work before staged_config[] is used? If so, maybe
the only thing the fix should do is to add memset() to
rdtgroup_init_alloc().>

No, let us not leave stale data lying around.

The idea is that the function calling resctrl_arch_update_domains() is
responsible for initializing staged_config[] correctly and completely.
To confirm, yes, the idea is to clear the staged_config[] in
rdtgroup_init_alloc() before resctrl_arch_update_domains() is called
to follow how it is currently done in rdtgroup_schemata_write().

But, as you indicate, by itself this would leave stale data lying around.

The solution that you suggested earlier, to put the clearing work where
rdtgroup_schemata_write() and rdtgroup_init_alloc() exit, is most logical.
That makes the code symmetrical in that staged_config[] is cleared
where it is initialized and no stale data is left lying around. What was
not clear to me is how this would look in the end. Were you planning to
keep the staged_config[] clearing within rdtgroup_schemata_write() but
not do so in rdtgroup_init_alloc()? rdtgroup_schemata_write() and
rdtgroup_init_alloc() has to follow the same pattern to reduce confusion.

So, to be more robust, how about:

/* Clear staged_config[] to make sure working from a clean slate */
resctrl_arch_update_domains()
/* Clear staged_config[] to not leave stale data lying around */


Thank you for your explanation, and it makes sense to me. But this will
require 4 memset() loops, how about putting the clearing work in
a separate function in rdtgroup.c, like rdt_last_cmd_clear():

Yes, thanks for avoiding duplicating code.


void staged_configs_clear(void) {
    struct resctrl_schema *s;
    struct rdt_domain *dom;

    lockdep_assert_held(&rdtgroup_mutex);

    list_for_each_entry(s, &resctrl_schema_all, list) {
        list_for_each_entry(dom, &s->res->domains, list)
            memset(dom->staged_config, 0, sizeof(dom->staged_config));
    }
}


I understand that you are just copying what is currently done in
rdtgroup_schemata_write() but for a separate function I think something
like below would be more efficient:


for_each_alloc_capable_rdt_resource(r) {
list_for_each_entry(dom, &r->domains, list)
memset(dom->staged_config, 0, sizeof(dom->staged_config));
}

This would be more efficient since it would not clean the same memory area
twice when CDP is enabled.


Thank you, I didn't notice this function before. I will add it in a new version.

Shawn