Re: [PATCH] cacheinfo: clear cache_leaves(cpu) in free_cache_attributes()

From: Xiongfeng Wang
Date: Tue Jul 13 2021 - 22:10:38 EST


Hi James,

On 2021/7/14 1:38, James Morse wrote:
> Hello,
>
> On 13/07/2021 14:26, Sudeep Holla wrote:
>> On Tue, Jul 13, 2021 at 08:46:19PM +0800, Xiongfeng Wang wrote:
>>> On 2021/7/13 19:33, Sudeep Holla wrote:
>>>> On Tue, Jul 13, 2021 at 11:47:38AM +0800, Xiongfeng Wang wrote:
>>>>> On ARM64, when PPTT(Processor Properties Topology Table) is not
>>>>> implemented in ACPI boot, we will goto 'free_ci' with the following
>>>>> print:
>>>>> Unable to detect cache hierarchy for CPU 0
>>>>>
>>>>
>>>> The change itself looks good and I am fine with that. However,...
>>>>
>>>>> But other code may still use 'num_leaves' to iterate through the
>>>>
>>>> Can you point me exactly where it is used to make sure there are no
>>>> other issues associated with that.
>>>>
>>>>> 'info_list', such as get_cpu_cacheinfo_id(). If 'info_list' is NULL, it
>>>>> would crash. So clear 'num_leaves' in free_cache_attributes().
>>>>>
>>>>
>>>> And can you provide the crash dump please? If we are not hitting any
>>>> issue and you just figured this out by code inspection, that is fine. It
>>>> helps to determine whether this needs to be backported or is just a
>>>> good-to-have cleanup.
>
>>> There is no issue in the mainline kernel. get_cpu_cacheinfo_id() is only called
>>> on x86. I didn't hit any issue using the mainline kernel.
>
>>> Actually, it's our own code that crashed. My colleague Shaobo (CCed) tried to add
>
> Seems to have dropped off the CC list.

Yes. I don't know why my CC to him failed. CCed again.

>
>>> MPAM support on ARM64.
>
> Do you want me to CC either of you on the series that refactors the resctrl code? This is
> the bit that needs doing to get MPAM working upstream.

It would be nice if you could CC him. His email address is
bobo.shaobowang@xxxxxxxxxx. Thanks a lot!

Below is the (openEuler version) MPAM support code he wrote, based on your
private version in the linux-arm.org repo:
https://gitee.com/openeuler/kernel/tree/openEuler-1.0-LTS/arch/arm64/kernel/mpam
It would be appreciated if you could give some advice.

Console output:
[root@localhost ~]# mount -t resctrl resctrl /sys/fs/resctrl/ -o cdpl3,caPbm,mbHdl,mbMax
[root@localhost ~]# cd /sys/fs/resctrl/
[root@localhost resctrl]# cat schemata
L3CODEPBM:0=7fff;1=7fff;2=7fff;3=7fff
L3DATAPBM:0=7fff;1=7fff;2=7fff;3=7fff
MBHDL:0=1;1=1;2=1;3=1
MBMAX:0=100;1=100;2=100;3=100

Thanks,
Xiongfeng

>
> (I copy Shameerali, but I've not heard from him in a while.)
>
>
>>> His code called get_cpu_cacheinfo_id() and crashed when
>>> PPTT is not implemented. Maybe he should check whether PPTT is implemented
>>> before calling get_cpu_cacheinfo_id(), but we think it is also better to clear
>>> cache_leaves(cpu) in free_cache_attributes().
>>> Sorry for not expressing this clearly.
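
To illustrate the point above, here is a simplified user-space sketch. It is
not the kernel code: the model_* structures and functions are made-up
stand-ins for struct cpu_cacheinfo, get_cpu_cacheinfo_id() and
free_cache_attributes(), reduced to the two fields that matter here. It only
shows why a lookup bounded by 'num_leaves' is safe once the free path also
clears the count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_cacheinfo {
	unsigned int level;
	int id;
};

struct model_cpu_cacheinfo {
	struct model_cacheinfo *info_list;
	unsigned int num_leaves;
};

/* Walks info_list with num_leaves as the only bound (no NULL check). */
static int model_get_cache_id(const struct model_cpu_cacheinfo *ci,
			      unsigned int level)
{
	unsigned int i;

	for (i = 0; i < ci->num_leaves; i++) {
		if (ci->info_list[i].level == level)
			return ci->info_list[i].id;
	}
	return -1;
}

/* Frees the list and clears the count, so later walks see an empty list. */
static void model_free_cache_attributes(struct model_cpu_cacheinfo *ci)
{
	free(ci->info_list);
	ci->info_list = NULL;
	ci->num_leaves = 0;
}

/* Simulates a failed detection followed by a lookup. */
static int model_lookup_after_failed_detect(void)
{
	struct model_cpu_cacheinfo ci = { NULL, 0 };

	ci.num_leaves = 3;
	ci.info_list = calloc(ci.num_leaves, sizeof(*ci.info_list));
	assert(ci.info_list != NULL);

	/* Error path: the hierarchy could not be detected. */
	model_free_cache_attributes(&ci);

	/* The loop body never runs because num_leaves is 0. */
	return model_get_cache_id(&ci, 3);
}
```

If the last assignment in model_free_cache_attributes() were dropped, the
loop in model_get_cache_id() would dereference a NULL 'info_list', which is
the kind of crash described above.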
>
> The ACPI tables for MPAM reference the PPTT, so you're going to need one.
>
>
>> Thanks for the detailed explanation. In this case I would drop the Fixes: tag
>> as it is not fixing anything in the commit mentioned in the tag.
>>
>> Also not sure if we can tag this as fixes
>> 709c4362725a ("cacheinfo: Move resctrl's get_cache_id() to the cacheinfo header file")
>> as that is introducing the possible access that could crash. @James ?
>
> If you like. If there is nothing broken it's hard to care.
> I guess this helps people doing backports.
>
>
>
> Thanks,
>
> James
>