Re: [PATCH v2 05/18] x86/resctrl: Allow RMID allocation to be scoped by CLOSID

From: James Morse
Date: Fri Mar 03 2023 - 13:34:30 EST


Hi Fenghua,

On 17/01/2023 18:53, Yu, Fenghua wrote:
>> MPAM's RMID values are not unique unless the CLOSID is considered as well.
>>
>> alloc_rmid() expects the RMID to be an independent number.
>>
>> Pass the CLOSID in to alloc_rmid(). Use this to compare indexes when allocating.
>> If the CLOSID is not relevant to the index, this ends up comparing the free RMID
>> with itself, and the first free entry will be used. With MPAM the CLOSID is
>> included in the index, so this becomes a walk of the free RMID entries, until one
>> that matches the supplied CLOSID is found.


>> b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index dbae380e3d1c..347be3767241 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -301,25 +301,51 @@ bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d)
>> return find_first_bit(d->rmid_busy_llc, idx_limit) != idx_limit;
>> }
>>
>> +static struct rmid_entry *resctrl_find_free_rmid(u32 closid)
>> +{
>> + struct rmid_entry *itr;
>> + u32 itr_idx, cmp_idx;
>> +
>> + if (list_empty(&rmid_free_lru))
>> + return rmid_limbo_count ? ERR_PTR(-EBUSY) : ERR_PTR(-ENOSPC);
>> +
>> + list_for_each_entry(itr, &rmid_free_lru, list) {
>> + /*
>> + * get the index of this free RMID, and the index it would need
>> + * to be if it were used with this CLOSID.
>> + * If the CLOSID is irrelevant on this architecture, these will
>> + * always be the same. Otherwise they will only match if this
>> + * RMID can be used with this CLOSID.
>> + */
>> + itr_idx = resctrl_arch_rmid_idx_encode(itr->closid, itr->rmid);
>> + cmp_idx = resctrl_arch_rmid_idx_encode(closid, itr->rmid);
>> +
>> + if (itr_idx == cmp_idx)
>> + return itr;
>
> Finding free rmid may be called frequently depending on usage.
>
> It would be better to have a simpler and faster arch helper that finds the itr on x86.
> Something like:
> struct rmid_entry *resctrl_arch_rmid_matchd(u32 ignored, u32 ignored)
> {
> return list_entry_first(resctrl_free_lru, itr, list);
> }
>
> Arm64 implements the complex case going through the rmid_free_lru list in the patch.

The trick here is that one degenerates into the other:

>> + list_for_each_entry(itr, &rmid_free_lru, list) {

The first time round the loop, this is equivalent to:
| itr = list_first_entry(&rmid_free_lru, struct rmid_entry, list);


>> + /*
>> + * get the index of this free RMID, and the index it would need
>> + * to be if it were used with this CLOSID.
>> + * If the CLOSID is irrelevant on this architecture, these will
>> + * always be the same. Otherwise they will only match if this
>> + * RMID can be used with this CLOSID.
>> + */
>> + itr_idx = resctrl_arch_rmid_idx_encode(itr->closid, itr->rmid);

On x86, after inlining this is:
| itr_idx = itr->rmid;

>> + cmp_idx = resctrl_arch_rmid_idx_encode(closid, itr->rmid);

and this is:
| cmp_idx = itr->rmid;

>> + if (itr_idx == cmp_idx)
>> + return itr;

So now any half-decent compiler can spot that this condition is always true and the loop
only ever runs once, and the whole thing reduces to what you wanted it to be.

This saves exposing things that should be private to the filesystem code and having
per-arch helpers to mess with it.

The commit message describes this; I'll expand the comment in the loop to be:
| * If the CLOSID is irrelevant on this architecture, these will
| * always be the same, meaning the compiler can reduce this loop
| * to a single list_first_entry() call.


Thanks,

James