Re: [PATCH v9 20/26] x86/resctrl: Auto assign/unassign counters when mbm_cntr_assign is enabled

From: Reinette Chatre
Date: Wed Dec 04 2024 - 14:28:16 EST


Hi Fenghua,

On 12/3/24 8:16 PM, Fenghua Yu wrote:
> Hi, Babu,
>
> On 10/29/24 16:21, Babu Moger wrote:
>> Assign/unassign counters on resctrl group creation/deletion. Two counters
>> are required per group, one for MBM total event and one for MBM local
>> event.
>>
>> There are a limited number of counters available for assignment. If these
>> counters are exhausted, the kernel will display the error message: "Out of
>> MBM assignable counters". However, it is not necessary to fail the
>> creation of a group due to assignment failures. Users have the flexibility
>> to modify the assignments at a later time.
>>
>> Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
>> ---
>> v9: Changed rdtgroup_assign_cntrs() and rdtgroup_unassign_cntrs() to return void.
>>      Updated couple of rdtgroup_unassign_cntrs() calls properly.
>>      Updated function comments.
>>
>> v8: Renamed rdtgroup_assign_grp to rdtgroup_assign_cntrs.
>>      Renamed rdtgroup_unassign_grp to rdtgroup_unassign_cntrs.
>>      Fixed the problem with unassigning the child MON groups of CTRL_MON group.
>>
>> v7: Reworded the commit message.
>>      Removed the reference of ABMC with mbm_cntr_assign.
>>      Renamed the function rdtgroup_assign_cntrs to rdtgroup_assign_grp.
>>
>> v6: Removed the redundant comments on all the calls of
>>      rdtgroup_assign_cntrs. Updated the commit message.
>>      Dropped printing error message on every call of rdtgroup_assign_cntrs.
>>
>> v5: Removed the code to enable/disable ABMC during the mount.
>>      That will be another patch.
>>      Added arch callers to get the arch specific data.
>>      Renamed fuctions to match the other abmc function.
>>      Added code comments for assignment failures.
>>
>> v4: Few name changes based on the upstream discussion.
>>      Commit message update.
>>
>> v3: This is a new patch. Patch addresses the upstream comment to enable
>>      ABMC feature by default if the feature is available.
>> ---
>>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 61 +++++++++++++++++++++++++-
>>   1 file changed, 60 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index b0cce3dfd062..a8d21b0b2054 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -2932,6 +2932,46 @@ static void schemata_list_destroy(void)
>>       }
>>   }
>>   +/*
>> + * Called when a new group is created. If "mbm_cntr_assign" mode is enabled,
>> + * counters are automatically assigned. Each group can accommodate two counters:
>> + * one for the total event and one for the local event. Assignments may fail
>> + * due to the limited number of counters. However, it is not necessary to fail
>> + * the group creation and thus no failure is returned. Users have the option
>> + * to modify the counter assignments after the group has been created.
>> + */
>> +static void rdtgroup_assign_cntrs(struct rdtgroup *rdtgrp)
>> +{
>> +    struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>> +
>> +    if (!resctrl_arch_mbm_cntr_assign_enabled(r))
>> +        return;
>> +
>> +    if (is_mbm_total_enabled())
>> +        rdtgroup_assign_cntr_event(r, rdtgrp, NULL, QOS_L3_MBM_TOTAL_EVENT_ID);
>
> In this code path,
> resctrl_mkdir()->resctrl_mkdir_ctrl_mon()->rdtgroup_assign_cntrs()->rdtgroup_assign_cntr_event()
>
> CPUs are not protected by read lock while rdtgroup_assign_cntr_event() walks r->mon_domains and run assing counters code on CPUs in the domains. Without CPU protection, r->mon_domains may race with CPU hotplug.

>From what I can tell rdtgroup_assign_cntrs() is called with CPU hotplug lock held:

rdtgroup_mkdir_ctrl_mon()
{

ret = mkdir_rdt_prepare(...);
/* mkdir_rdt_prepare()->rdtgroup_kn_lock_live()->cpus_read_lock() */
...
rdtgroup_assign_cntrs(rdtgrp);
...
rdtgroup_kn_unlock(parent_kn);
/* rdtgroup_kn_unlock()->cpus_read_unlock() */
return ret;
}

Reinette