[PATCH 0/4] x86,fs/resctrl: kernel-mode (PLZA) fixes found during review

From: Qinyun Tan

Date: Thu Jun 11 2026 - 07:17:56 EST


Hi Babu,

While reviewing this v3 series I found a few issues in the kernel-mode
(PLZA) support and wrote a fix for each.
I'm sending them as a small follow-up set on top of v3 so they are easy
to fold into the next revision, or to take as separate patches --
whichever you prefer. The patches are ordered by dependency (build fix
-> semantic fix -> the two binding fixes) so the series is bisectable on
top of v3.

Patch 1 (ARM MPAM build fix): fs/resctrl now calls
resctrl_arch_get_kmode_support()/resctrl_arch_configure_kmode(), which
are only implemented on x86, so an aarch64 allyesconfig (MPAM) fails to
link. Add empty arch stubs, and hide info/kernel_mode on platforms that
advertise no mode beyond inherit_ctrl_and_mon.

Patch 2 (RMID_EN + RDTMON_GROUP): RMID_EN is hardcoded to 1, so
inherit_mon counts kernel-mode traffic under the PLZA RMID instead of
inheriting from PQR_ASSOC; and assign_mon is forced to bind an
RDTMON_GROUP, wasting an RMID. Make RMID_EN mode-based and let assign_mon
also accept a control group. This is the issue we discussed earlier and
you confirmed; this is the patch for it.
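
For illustration only, the rule patch 2 describes can be modelled in
userspace C as below. The enum and function names are hypothetical
stand-ins chosen for this sketch, not identifiers from the series, and
no MSR bit layout is implied.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mode and group-type names, for illustration only. */
enum kmode {
	KMODE_INHERIT_CTRL_AND_MON,
	KMODE_INHERIT_MON,
	KMODE_ASSIGN_MON,
};

enum group_type {
	GROUP_CTRL,	/* stands in for a control group */
	GROUP_MON,	/* stands in for an RDTMON_GROUP */
};

/*
 * RMID_EN follows the mode instead of being hardcoded to 1: only
 * assign_mon wants kernel-mode traffic counted under the PLZA RMID;
 * the inherit modes keep counting under the RMID from PQR_ASSOC.
 */
static bool kmode_rmid_en(enum kmode mode)
{
	assert(mode <= KMODE_ASSIGN_MON);
	return mode == KMODE_ASSIGN_MON;
}

/* assign_mon may bind either group type, so no extra RMID is needed. */
static bool assign_mon_accepts(enum group_type type)
{
	return type == GROUP_CTRL || type == GROUP_MON;
}
```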

Patch 3 (atomic switch): resctrl_kernel_mode_write() releases the
previous binding before it programs the new one. If programming the new
binding fails (-ENOMEM, or a pseudo-locked target group), the old,
working binding is already gone -- a user who only tried to switch loses
the original configuration too. Make the switch atomic: all fallible
work is done before the old binding is released, so a failed switch is a
no-op.
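
The ordering can be sketched in userspace C roughly as follows. This is
a minimal model, not the patch itself: the struct, the helper names, the
fail_alloc knob and the -EINVAL returned for a pseudo-locked group are
all assumptions made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical binding record; not the structure used by the series. */
struct kmode_binding {
	int mode;
	int group;
	bool valid;
};

/* Test knob that simulates an allocation failure (-ENOMEM). */
static bool fail_alloc;

/* All fallible work lives here and only ever touches *next. */
static int kmode_prepare(int mode, int group, bool pseudo_locked,
			 struct kmode_binding *next)
{
	if (pseudo_locked)
		return -EINVAL;	/* error code is an assumption */
	if (fail_alloc)
		return -ENOMEM;

	next->mode = mode;
	next->group = group;
	next->valid = true;
	return 0;
}

/* Prepare first, commit last: on failure *cur is left untouched. */
static int kmode_switch(struct kmode_binding *cur, int mode, int group,
			bool pseudo_locked)
{
	struct kmode_binding next = { 0 };
	int ret;

	ret = kmode_prepare(mode, group, pseudo_locked, &next);
	if (ret)
		return ret;

	*cur = next;	/* the commit step cannot fail */
	return 0;
}
```

The point of the split is that the old binding is only replaced after
nothing can fail any more, so the error path has nothing to roll back.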

Patch 4 (CPU online): the PLZA MSR is per-CPU and is only written on the
CPUs that are online at bind time or at a mask change; nothing reprograms
a CPU that comes online afterwards. A hot-added vCPU, or a CPU that was
offline at bind time, then runs with PLZA off although it is in scope,
while info/kernel_mode still reports the binding as active. Drive the
per-CPU state from resctrl_online_cpu() so it is synced idempotently on
every online (and stale enable bits are cleared for a CPU that left the
scope while offline).

Concretely, the patch 4 failure mode is: offline a CPU, bind a
global-assign mode while it is absent, then online it. The onlined CPU
is left with PLZA_EN=0 although it is in scope, so its CPL0 traffic is
not accounted to the bound kernel-mode group, while a CPU that was
present at bind time has PLZA_EN=1.
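
A small userspace model of that sequence, and of the idempotent sync on
online, might look like the following. The arrays stand in for per-CPU
MSR state and every name here is hypothetical; it only illustrates the
behaviour described above, not the actual resctrl_online_cpu() change.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical binding: whether it is active and which CPUs are in scope. */
struct kmode_binding {
	bool active;
	bool in_scope[NR_CPUS];
};

static bool cpu_is_online[NR_CPUS];
static bool plza_en[NR_CPUS];	/* models the per-CPU enable bit */

/* Bind-time / mask-change programming: reaches online CPUs only. */
static void program_online_cpus(const struct kmode_binding *b)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_is_online[cpu])
			plza_en[cpu] = b->active && b->in_scope[cpu];
}

/*
 * Online-path sync: derive the bit from the current binding every
 * time, so repeated calls are harmless and a stale bit is cleared.
 */
static void sync_cpu_on_online(const struct kmode_binding *b, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_is_online[cpu] = true;
	plza_en[cpu] = b->active && b->in_scope[cpu];
}
```

Because the sync recomputes the bit from the binding instead of toggling
it, calling it on every online event is safe whether or not the CPU was
already programmed.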

I'd appreciate your view on whether these match your intent for the
design.

Qinyun Tan (4):
  resctrl: Add kmode arch stubs for ARM MPAM and hide kernel_mode on
    non-PLZA platforms
  resctrl: Fix PLZA RMID_EN to be mode-based and relax RDTMON_GROUP
    constraint for assign_mon
  fs/resctrl: make a failed kernel-mode switch a no-op
  fs/resctrl: program PLZA on a CPU that comes online under a binding

 arch/x86/kernel/cpu/resctrl/ctrlmondata.c |   9 +-
 drivers/resctrl/mpam_resctrl.c            |   9 +
 fs/resctrl/rdtgroup.c                     | 235 ++++++++++++++--------
 include/linux/resctrl.h                   |   8 +-
 4 files changed, 171 insertions(+), 90 deletions(-)

--
2.43.7