Re: [PATCH 02/15] x86/resctrl: Remove hard-coded memory bandwidth event configuration

From: Moger, Babu
Date: Thu Dec 07 2023 - 18:37:56 EST


Hi Reinette,

-----Original Message-----
From: Reinette Chatre <reinette.chatre@xxxxxxxxx>
Sent: Thursday, December 7, 2023 1:02 PM
To: Moger, Babu <Babu.Moger@xxxxxxx>; corbet@xxxxxxx;
fenghua.yu@xxxxxxxxx; tglx@xxxxxxxxxxxxx; mingo@xxxxxxxxxx;
bp@xxxxxxxxx; dave.hansen@xxxxxxxxxxxxxxx
Cc: x86@xxxxxxxxxx; hpa@xxxxxxxxx; paulmck@xxxxxxxxxx;
rdunlap@xxxxxxxxxxxxx; tj@xxxxxxxxxx; peterz@xxxxxxxxxxxxx;
seanjc@xxxxxxxxxx; Phillips, Kim <kim.phillips@xxxxxxx>;
jmattson@xxxxxxxxxx; ilpo.jarvinen@xxxxxxxxxxxxxxx;
jithu.joseph@xxxxxxxxx; kan.liang@xxxxxxxxxxxxxxx; Dadhania, Nikunj
<nikunj.dadhania@xxxxxxx>; daniel.sneddon@xxxxxxxxxxxxxxx;
pbonzini@xxxxxxxxxx; rick.p.edgecombe@xxxxxxxxx; rppt@xxxxxxxxxx;
maciej.wieczor-retman@xxxxxxxxx; linux-doc@xxxxxxxxxxxxxxx; linux-
kernel@xxxxxxxxxxxxxxx; eranian@xxxxxxxxxx; peternewman@xxxxxxxxxx;
Giani, Dhaval <Dhaval.Giani@xxxxxxx>
Subject: Re: [PATCH 02/15] x86/resctrl: Remove hard-coded memory
bandwidth event configuration

Hi Babu,

On 12/6/2023 11:17 AM, Moger, Babu wrote:
> On 12/6/23 12:32, Reinette Chatre wrote:
>> On 12/6/2023 9:17 AM, Moger, Babu wrote:
>>> On 12/5/23 17:21, Reinette Chatre wrote:
>>>> On 11/30/2023 4:57 PM, Babu Moger wrote:

...

>>>>> static void mondata_config_read(struct rdt_domain *d, struct
>>>>> mon_config_info *mon_info) @@ -1621,7 +1621,7 @@ static int
mbm_config_write_domain(struct rdt_resource *r,
>>>>> int ret = 0;
>>>>>
>>>>> /* mon_config cannot be more than the supported set of events */
>>>>> - if (val > MAX_EVT_CONFIG_BITS) {
>>>>> + if (val > resctrl_max_evt_bitmask) {
>>>>> rdt_last_cmd_puts("Invalid event configuration\n");
>>>>> return -EINVAL;
>>>>> }
>>>>
>>>> This does not look right. resctrl_max_evt_bitmask contains the
>>>> supported types. A user may set a value that is less than
>>>> resctrl_max_evt_bitmask but yet have an unsupported bit set, no?
>>>
>>> I think I have to make this clear in the patch. There is no
>>> difference in the definition. Hardware supports all the events reported by
the cpuid.
>>
>> I'll try to elaborate using an example. Let's say AMD decides to make
>> hardware with hypothetical support mask of:
>> resctrl_max_evt_bitmask = 0x4F (no support for Slow Mem).
>>
>> What if user attempts to set config that enables monitoring of Slow Mem:
>> val = 0x30
>>
>> In the above example, val is not larger than resctrl_max_evt_bitmask
>> but it is an invalid config, no?
>
> Yes. It is invalid config in this case.
>
> How about changing the check to something like this?
>
> if ((val & resctrl_max_evt_bitmask) != val) {
> rdt_last_cmd_puts("Invalid event configuration\n");
> return -EINVAL;
> }

This would address the scenario. I also think that it will be helpful to print the
valid bitmask as part of the error message. The original implementation
specified that all bits are valid and in doing so no interface accompanied the
feature to share with users what the valid bits are. The only way user space
can learn this is is to read the *_config files after the first resctrl mount after a
system boot to see with which config values the system was initialized with
(assuming system was initialized with all supported bits enabled).

Sure. Will add the error message including the valid bitmask.
Thanks
Babu