Re: [PATCH 02/15] x86/resctrl: Remove hard-coded memory bandwidth event configuration

From: Reinette Chatre
Date: Thu Dec 07 2023 - 14:02:42 EST


Hi Babu,

On 12/6/2023 11:17 AM, Moger, Babu wrote:
> On 12/6/23 12:32, Reinette Chatre wrote:
>> On 12/6/2023 9:17 AM, Moger, Babu wrote:
>>> On 12/5/23 17:21, Reinette Chatre wrote:
>>>> On 11/30/2023 4:57 PM, Babu Moger wrote:

...

>>>>> static void mondata_config_read(struct rdt_domain *d, struct mon_config_info *mon_info)
>>>>> @@ -1621,7 +1621,7 @@ static int mbm_config_write_domain(struct rdt_resource *r,
>>>>> int ret = 0;
>>>>>
>>>>> /* mon_config cannot be more than the supported set of events */
>>>>> - if (val > MAX_EVT_CONFIG_BITS) {
>>>>> + if (val > resctrl_max_evt_bitmask) {
>>>>> rdt_last_cmd_puts("Invalid event configuration\n");
>>>>> return -EINVAL;
>>>>> }
>>>>
>>>> This does not look right. resctrl_max_evt_bitmask contains the supported
>>>> types. A user may set a value that is less than resctrl_max_evt_bitmask but
>>>> yet have an unsupported bit set, no?
>>>
>>> I think I have to make this clear in the patch. There is no difference in
>>> the definition. Hardware supports all the events reported by the cpuid.
>>
>> I'll try to elaborate using an example. Let's say AMD decides to make
>> hardware with hypothetical support mask of:
>> resctrl_max_evt_bitmask = 0x4F (no support for Slow Mem).
>>
>> What if user attempts to set config that enables monitoring of Slow Mem:
>> val = 0x30
>>
>> In the above example, val is not larger than resctrl_max_evt_bitmask
>> but it is an invalid config, no?
>
> Yes. It is invalid config in this case.
>
> How about changing the check to something like this?
>
> if ((val & resctrl_max_evt_bitmask) != val) {
> rdt_last_cmd_puts("Invalid event configuration\n");
> return -EINVAL;
> }

This would address the scenario. I also think that it will be helpful to
print the valid bitmask as part of the error message. The original implementation
specified that all bits are valid and in doing so no interface accompanied the
feature to share with users what the valid bits are. The only way user space can learn
this is is to read the *_config files after the first resctrl mount after a system boot
to see with which config values the system was initialized with (assuming system was
initialized with all supported bits enabled).

Reinette