Re: [PATCH v15-RFC 0/8] Add support for Sub-NUMA cluster (SNC) systems
From: Reinette Chatre
Date: Fri Feb 09 2024 - 19:29:22 EST
Hi Tony,
On 2/9/2024 3:44 PM, Luck, Tony wrote:
>> I actually had specific points that this response also ignores.
>> Let me repeat and highlight the same points:
>>
>> 1) You claim that this series "removes the need for separate domain
>> lists" ... but then this series does just that (create a separate
>> domain list), but in an obfuscated way (duplicate the resource to
>> have the monitoring domain list in there).
>
> That was poorly worded on my part. I should have said "removes the
> need for separate domain lists within a single rdt_resource".
>
> Adding an extra domain list to a resource may be the start of a slippery
> slope. What if there is some additional "L3"-like resctrl operation that
> acts at the socket level (Intel has made products with multiple L3
> instances per socket before). Would you be OK add a third domain
> list to every struct rdt_resource to handle this? Or would it be simpler
> to just add a new rdt_resource structure with socket scoped domains?
This should not be about what is simplest to patch into current resctrl.
There is no need to support a new domain list for a new scope. The domain
lists support the functionality: control or monitoring. If control has
socket scope the existing implementation supports that.
If there is another operation supported by a resource apart from
control or monitoring then we can consider how to support it when
we know what it is. That would also be a great point to decide if
the same data structure should just grow to support an operation that
not all resources may support. That may depend on the amount of data
needed to support this hypothetical operation.
>
>> 2) You claim this series "reduces amount of code churn", but this is
>> because this series keeps using the same original data structures
>> for separate monitoring and control usages. The previous series made
>> an effort to separate the structures for the different usages
>> but this series does not. What makes it ok in this series to
>> use the same data structures for different usages?
>
> Legacy resctrl has been using the same rdt_domain structure for both
> usages since the dawn of time. So it has been OK up until now.
This is not the same.
Legacy resctrl uses the same data structure in the same list for both control
and monitoring usages so it is fine to have both monitoring and control data
in the data structure.
What you are doing in both solutions is to place the same data structure
in separate lists for control and monitoring usages. In the one list only the
control data is used, on the other only the monitoring data is used.
>> Additionally:
>>
>> Regarding "Vast amounts of that just added "_mon" or "_ctrl" to structure
>> or variable names." ... that is because the structures are actually split,
>> no? It is not just renaming for unnecessary churn.
>
> Perhaps not "unnecessary" churn. But certainly a lot of code change for
> what I perceive as very little real gain.
ok. There may be little gain wrt saving space. One complication with
this single data structure is that its content may only be decided based
on which list it is part of. It should be obvious to developers when
which members are valid. Perhaps this can be addressed with clear
documentation of the data structures.
>
>> What is the benefit of keeping the data structures to be shared
>> between monitor and control usages?
>
> Benefit is no code changes. Cost is continuing to waste memory with
> structures that are slightly bigger than they need to be.
>
>> If there is a benefit to keeping these data structures, why not just
>> address this aspect in previous solution?
>
> The previous solution evolved to splitting these structures. But this
> happened incrementally (remember that at an early stage the monitor
> structures all got the "_mon" addition to their names, but the control
> structures kept the original names). Only when I got to the end of this
> process did I look at the magnitude of the change.
Not answering my question.
Reinette