Re: [PATCH v4 07/21] x86/resctrl: Create mba_sc configuration in the rdt_domain

From: James Morse
Date: Tue Jun 07 2022 - 08:08:10 EST


Hi Reinette,

On 17/05/2022 17:18, Reinette Chatre wrote:
> On 4/12/2022 5:44 AM, James Morse wrote:
>> @@ -3263,6 +3295,7 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d)
>> cancel_delayed_work(&d->cqm_limbo);
>> }
>>
>> + mba_sc_domain_destroy(r, d);
>> domain_destroy_mon_state(d);
>> }
>
> It is not clear to me how rdt_domain->mbps_val will be released via the above call.
>
> After patch 3/21 and the hunk below resctrl_online_domain() would look like:

[..]

> If I understand the above correctly, if MBM is enabled then all domains
> of resource RDT_RESOURCE_MBA will have rdt_domain->mbps_val allocated via
> resctrl_online_domain().
>
> RDT_RESOURCE_MBA is not mon_capable,

Bother - this is part of the mistake I made with v3.
(in MPAM, all resources can be alloc_capable or mon_capable - this trips me up every time)


> so at the time its domains go
> offline, the freeing of rdt_domain->mbps_val will be skipped because
> after patch 5/21 resctrl_offline_domain() would look like below so
> I do not see how the hunk added above will ever end up cleaning up
> allocated memory:

Yup, I missed this when fixing the mistake you pointed out in v3.

I've changed this to have:
| if (supports_mba_mbps() && r->rid == RDT_RESOURCE_MBA)
|	mba_sc_domain_destroy(r, d);

in resctrl_offline_domain().
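
For illustration only, here is a rough userspace sketch of the pairing: the offline path frees mbps_val under the same condition the online path used to allocate it, before any mon_capable early return. The struct layouts, MOCK_NUM_CLOSID, the mock_mba_mbps_supported flag and example_roundtrip() are assumptions made up for this sketch, and supports_mba_mbps() is stubbed rather than copied from the series. calloc()/free() stand in for the kernel allocators.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-ins for the kernel definitions; everything here is illustrative. */
enum { RDT_RESOURCE_L3, RDT_RESOURCE_MBA };
#define MOCK_NUM_CLOSID 8	/* assumed size, not taken from the patch */

struct rdt_resource {
	int rid;
	bool mon_capable;
};

struct rdt_domain {
	unsigned int *mbps_val;
};

/* Assumption: the real helper inspects hardware state; here it is a flag. */
static bool mock_mba_mbps_supported = true;

bool supports_mba_mbps(void)
{
	return mock_mba_mbps_supported;
}

int mba_sc_domain_allocate(struct rdt_resource *r, struct rdt_domain *d)
{
	(void)r;
	d->mbps_val = calloc(MOCK_NUM_CLOSID, sizeof(*d->mbps_val));
	return d->mbps_val ? 0 : -ENOMEM;
}

void mba_sc_domain_destroy(struct rdt_resource *r, struct rdt_domain *d)
{
	(void)r;
	free(d->mbps_val);
	d->mbps_val = NULL;
}

int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d)
{
	if (supports_mba_mbps() && r->rid == RDT_RESOURCE_MBA) {
		int err = mba_sc_domain_allocate(r, d);

		if (err)
			return err;
	}

	if (!r->mon_capable)
		return 0;

	return 0;	/* monitor state setup elided */
}

void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d)
{
	/* Same condition as the online path, checked before the early return. */
	if (supports_mba_mbps() && r->rid == RDT_RESOURCE_MBA)
		mba_sc_domain_destroy(r, d);

	if (!r->mon_capable)
		return;

	/* monitor state teardown elided */
}

/* Online then offline a non-mon_capable MBA domain; mbps_val must not leak. */
int example_roundtrip(void)
{
	struct rdt_resource mba = { .rid = RDT_RESOURCE_MBA, .mon_capable = false };
	struct rdt_domain d = { .mbps_val = NULL };
	int err = resctrl_online_domain(&mba, &d);

	if (err)
		return err;
	assert(d.mbps_val != NULL);

	resctrl_offline_domain(&mba, &d);
	assert(d.mbps_val == NULL);
	return 0;
}
```

Because the free is keyed on the resource id rather than on mon_capable, the MBA domain's array is released even though that resource never reaches the monitor teardown.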



>> @@ -3302,12 +3335,20 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d)
>>
>> lockdep_assert_held(&rdtgroup_mutex);
>>
>> + if (is_mbm_enabled() && r->rid == RDT_RESOURCE_MBA) {
>
> This introduces only half of the checks that are later replaced in
> patch 10 "x86/resctrl: Abstract and use supports_mba_mbps()". Could the
> full check be used here for that patch to be cleaner or perhaps patch 10
> could be moved to be before this patch?

Great idea.


>> + err = mba_sc_domain_allocate(r, d);
>> + if (err)
>> + return err;
>> + }
>> +
>> if (!r->mon_capable)
>> return 0;
>>
>> err = domain_setup_mon_state(r, d);
>> - if (err)
>> + if (err) {
>> + mba_sc_domain_destroy(r, d);
>> return err;
>> + }
>
> Cleaning up after the error is reasonable but this allocation would only
> ever happen if the resource is RDT_RESOURCE_MBA and it is not mon_capable.
> Something would thus have gone really wrong if this cleanup is necessary.
> Considering that only mon_capable resources are initialized at this point,
> why not just exit right after calling mba_sc_domain_allocate()?

I'm a little uncomfortable adding more places that hardcode "this resource is never
mon_capable"; it's something that has to be bodged around by MPAM, where any resource can
have monitors.

But sure, this just needs looking at in more detail in the future.


Thanks,

James