Re: [PATCH] x86/resctrl: Fix memory bandwidth counter width for AMD
From: Babu Moger
Date: Tue Jun 02 2020 - 13:33:44 EST
On 6/2/20 12:13 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 6/1/2020 4:00 PM, Babu Moger wrote:
>> Memory bandwidth is calculated by reading the monitoring counter
>> at two intervals and calculating the delta. It is the software's
>> responsibility to read the count often enough to avoid having
>> the count roll over _twice_ between reads.
>>
>> The current code hardcodes the bandwidth monitoring counter's width
>> to 24 bits for AMD, because the default base counter width is 24.
>> Currently, AMD does not implement CPUID 0xF.[ECX=1]:EAX to adjust
>> the counter width. However, AMD hardware supports a much wider
>> bandwidth counter, with a default width of 44 bits.
>>
>> The kernel reads these monitoring counters once per second and adjusts
>> the counter value for overflow. With 24 bits and a scale value of 64 for
>> AMD, it can only measure up to 1GB/s without overflowing. For rates
>> above 1GB/s, the bandwidth measurement will be wrong.
>>
>> Fix the issue by setting the default width to 44 bits, adjusting the
>> offset accordingly.
>>
>> Future AMD products will implement CPUID 0xF.[ECX=1]:EAX.
>>
>> Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
>
> There is no fixes tag but if I understand correctly this issue has been
> present since AMD support was added to resctrl. This fix builds on top
> of a recent feature addition and would thus not work for earlier
> kernels. Are you planning to create a different fix for earlier kernels?
Yes. This has been there since day one. I am going to backport it to older
kernels once we arrive at the final patch. Do we need a Fixes tag here?
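
For reference, the arithmetic in the commit message can be checked with a
small userspace sketch. This is only an illustration, not the patch itself:
the helper names max_bytes_before_wrap() and counter_delta() are made up for
this example, and the shift-based delta is one common way to tolerate a
single wrap of an N-bit counter, assumed here rather than quoted from the
resctrl code.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of bytes a counter of `width` bits can
 * represent before it wraps, given a bytes-per-count `scale`.
 * Assumes 0 < width < 58 so the product fits in 64 bits for scale = 64.
 */
static uint64_t max_bytes_before_wrap(unsigned int width, uint64_t scale)
{
	return ((uint64_t)1 << width) * scale;
}

/*
 * Hypothetical helper: delta between two reads of a `width`-bit counter.
 * Shifting both values to the top of a 64-bit word makes the unsigned
 * subtraction wrap at 2^width, so a single rollover between the two
 * reads still yields the right count. A second rollover is undetectable.
 * Assumes 0 < width < 64.
 */
static uint64_t counter_delta(uint64_t prev, uint64_t cur, unsigned int width)
{
	unsigned int shift = 64 - width;
	uint64_t chunks = (cur << shift) - (prev << shift);

	return chunks >> shift;
}
```

With a width of 24 and a scale of 64, max_bytes_before_wrap() gives 2^30
bytes, which at one read per second is the 1GB/s limit mentioned above. With
a width of 44 the same calculation gives 2^50 bytes per read interval.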