Re: [PATCH] arm64/mpam: Support partial-core boot for MPAM
From: Ben Horgan
Date: Mon Feb 02 2026 - 06:34:39 EST
Hi Zeng,
On 2/2/26 09:16, Zeng Heng wrote:
>
>
> On 2026/2/2 16:41, Zeng Heng wrote:
>>
>>
>> On 2026/1/29 18:11, Ben Horgan wrote:
>>> Hi Zeng,
>>>
>>> I think I've just managed to whitelist your email address. So, all being
>>> well I'll get your emails in my inbox.
>>>
>>> On 1/7/26 03:13, Zeng Heng wrote:
>>>> Some MPAM MSCs (like the L2 MSC) share the same power domain as their
>>>> associated CPUs. Therefore, in scenarios where only some cores power
>>>> up, the MSCs belonging to the unpowered cores don't need to be, and
>>>> should not be, accessed; otherwise a bus-access fault would occur.
>>>
>>> The MPAM driver intentionally waits until all MSCs have been
>>> discovered before allowing MPAM to be used so that it can check the
>>> properties of all the MSC and determine the configuration based on full
>>> knowledge. Once a CPU affine with each MSC has been enabled then MPAM
>>> will be enabled and usable.
>>>
>>> Suppose we weren't to access all MSCs in an asymmetric configuration.
>>> E.g. if different L2 had different lengths of cache portion bit maps and
>>> MPAM was enabled with only the CPUs with the same L2 then the driver
>>> wouldn't know and we'd end up with a bad configuration which would
>>> become a problem when the other CPUs are eventually turned on.
>>>
>>> Hence, I think we should retain the restriction that MPAM is only
>>> enabled once all MSC are probed. Is this a particularly onerous
>>> restriction for you?
>>>
>>
>> I have no objection to the restriction that "MPAM is only enabled once
>> all MSC are probed." This constraint ensures the driver has complete
>> knowledge of all Memory System Components before establishing the
>> configuration.
>>
>>
>> However, this patch is specifically designed to address CPU core
>> isolation scenarios (such as adding the 'isolcpus=xx' kernel command
>> line parameter).
In the isolation scenario, are you enabling MPAM for some CPUs and using
those CPUs, but without taking into account the parameters of the
associated MSCs?
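
To make the concern about missing MSC parameters concrete, here is a toy
model in Python. It is only a sketch, not the MPAM driver's code: it
assumes the usable cache portion bitmap width is the narrowest width among
the MSCs probed so far, and the widths (16, 16, 8) are made-up values.

```python
# Toy model only: this is not the kernel's MPAM driver code.
# Assumption: the usable cache portion bitmap (CPBM) width is the
# narrowest width among the MSCs that have been probed.

def common_cpbm_width(widths):
    """Return the narrowest CPBM width among the MSCs seen so far."""
    if not widths:
        raise ValueError("no MSC has been probed")
    return min(widths)


def full_mask(width):
    """Bitmap with the low `width` bits set."""
    return (1 << width) - 1


# Hypothetical widths: two L2 MSCs with 16 portions, one with 8.
all_l2_widths = [16, 16, 8]
probed_early = all_l2_widths[:2]  # only CPUs behind the 16-bit L2s are up

early = full_mask(common_cpbm_width(probed_early))
final = full_mask(common_cpbm_width(all_l2_widths))

print(hex(early))           # 0xffff
print(hex(final))           # 0xff
# Bits handed out early that the third L2 cannot represent:
print(hex(early & ~final))  # 0xff00
```

In this model, a configuration chosen from the first two MSCs alone uses
bits that the third MSC does not have, which is the kind of mismatch that
only shows up when the remaining CPUs come online.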
>>
>> The patch allows the MPAM driver to successfully complete the
>> initialization of online MSCs even when the system is booted with
>> certain cores isolated or disabled. The patch ensures that MPAM
>> initialization is decoupled from the requirement that all CPUs must be
>> online during the probing phase.
>>
>> CPU core isolation is indeed a common production scenario. It
>> requires the kernel to keep features working in the presence of
>> faulty cores (which cannot be recovered through a cold boot). This
>> ensures system reliability and availability on multi-core
>> processors where single-core faults occur.
>>
>> Without this patch, MPAM fails to initialize under CPU core
>> isolation scenarios. Apologies for not mentioning this in the patch:
>> the functionality can be verified by adding 'maxcpus=1' to the boot
>> parameters.
For 'maxcpus=1' I think the correct behaviour is to not enable MPAM, as
the other CPUs can then be turned on afterwards, e.g. by

  echo 1 > /sys/devices/system/cpu/cpuX/online

For faulty cores, how would you ensure they are never turned on?
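
As a rough illustration of the 'maxcpus=1' case, the sketch below lists the
CPUs that are present but not online, i.e. the ones that could still be
brought up later. The list format is the one used by
/sys/devices/system/cpu/present and /sys/devices/system/cpu/online. The
example strings ("0-191" and "0") are assumptions for a 12 x 16 core
machine, not values taken from a real system.

```python
def parse_cpu_list(text):
    """Parse a sysfs CPU list such as '0-3,8' into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty list or a trailing comma
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def offline_present_cpus(present, online):
    """Return the sorted CPUs that are present but not currently online."""
    return sorted(parse_cpu_list(present) - parse_cpu_list(online))


# Assumed contents of the 'present' and 'online' files after booting a
# 192-CPU machine with 'maxcpus=1'.
pending = offline_present_cpus("0-191", "0")
print(len(pending))  # 191
print(pending[:3])   # [1, 2, 3]
```

Any CPU in that list can later be onlined from userspace, so the MSCs
behind it may still need to be accessed.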
>>
>> Please let me know if you have any further questions or concerns.
>>
>>
>> Best Regards,
>> Zeng Heng
>>
>>
>
> My platform consists of 12 clusters, each containing 16 CPU cores. Under
> normal boot conditions, the schemata is as follows:
Thank you for sharing this information.
>
> # mount -t resctrl resctrl /sys/fs/resctrl/
> # cat /sys/fs/resctrl/schemata
> L3:1=1ffff;26=1ffff;51=1ffff;76=1ffff;101=1ffff;126=1ffff;151=1ffff;
> 176=1ffff;201=1ffff;226=1ffff;251=1ffff;276=1ffff
>
> Adding 'maxcpus=1' to the boot parameters:
> Without this patch, MPAM initialization fails.
> With the patch, MPAM initialization succeeds, the number of clusters
> matches expectations, and the schemata is as follows:
>
> # mount -t resctrl resctrl /sys/fs/resctrl/
> # cat schemata
> L3:1=1ffff
>
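
For comparing the two outputs above, a small Python sketch that parses a
bitmask-style schemata line into a mapping of domain id to mask may help.
It only handles "RESOURCE:id=hexmask;..." lines like the L3 ones quoted
here; the function name is made up for illustration.

```python
def parse_schemata_line(line):
    """Split 'L3:1=1ffff;26=1ffff' into ('L3', {1: 0x1ffff, 26: 0x1ffff})."""
    resource, _, domains = line.strip().partition(":")
    masks = {}
    for entry in domains.split(";"):
        if not entry:
            continue  # tolerate a trailing ';'
        dom_id, _, mask = entry.partition("=")
        masks[int(dom_id)] = int(mask, 16)  # bitmask resources are hex
    return resource.strip(), masks


# The two L3 lines quoted in this mail (the first one re-joined).
normal = ("L3:1=1ffff;26=1ffff;51=1ffff;76=1ffff;101=1ffff;126=1ffff;"
          "151=1ffff;176=1ffff;201=1ffff;226=1ffff;251=1ffff;276=1ffff")
maxcpus1 = "L3:1=1ffff"

_, normal_masks = parse_schemata_line(normal)
_, single_masks = parse_schemata_line(maxcpus1)

print(len(normal_masks))  # 12
print(len(single_masks))  # 1
# Domains that are missing from the 'maxcpus=1' boot:
print(sorted(set(normal_masks) - set(single_masks)))
```

With the lines quoted above, this reports 12 domains for the normal boot
and 1 for the 'maxcpus=1' boot, so 11 L3 domains are not represented in the
second schemata.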
>
> Thanks,
> Zeng Heng
>
Thanks,
Ben