Re: [PATCH v11 06/10] fs/resctrl: Add user interface to enable/disable io_alloc feature
From: Moger, Babu
Date: Wed Nov 05 2025 - 12:31:09 EST
Hi Dave,
On 11/5/2025 10:46 AM, Dave Martin wrote:
Hi Babu,
On Thu, Oct 30, 2025 at 12:15:35PM -0500, Babu Moger wrote:
AMD's SDCIAE forces all SDCI lines to be placed into the L3 cache portions
identified by the highest-supported L3_MASK_n register, where n is the
maximum supported CLOSID.
To support AMD's SDCIAE, when the io_alloc resctrl feature is enabled, reserve
the highest CLOSID exclusively for I/O allocation traffic, making it no
longer available for general CPU cache allocation.
Does resctrl have a free choice for which CLOSID to use? (From the
code, it appears "yes"?)
Yes.
But on AMD systems it is the highest CLOSID (15). Also, this CLOSID usage is not visible to the user. The PQR_ASSOC register is not updated during the context switch. Hardware internally routes the traffic using the limits configured for CLOSID 15.
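As a rough illustration of the "highest CLOSID" arithmetic: CLOSIDs are numbered from 0, so the highest one is the count reported in num_closids minus one (16 gives 15). The helper name highest_closid and the RESCTRL_ROOT override are assumptions made for this sketch only. It shows the arithmetic, not how the kernel selects the io_alloc CLOSID.

```shell
#!/bin/sh
# Hypothetical helper (not part of resctrl): print the highest CLOSID
# for a resource, derived from the existing num_closids info file.
# RESCTRL_ROOT is an assumption for this sketch; it defaults to the
# usual mount point and exists only so the helper can be tried on a copy.
highest_closid() {
    resource="${1:-L3}"
    root="${RESCTRL_ROOT:-/sys/fs/resctrl}"

    # num_closids reports how many CLOSIDs are valid for the resource.
    num=$(cat "$root/info/$resource/num_closids") || return 1

    # CLOSIDs are numbered from 0, so the highest is count - 1.
    echo $((num - 1))
}
```

Calling `highest_closid L3` on a mounted resctrl prints the index this thread refers to as "the highest CLOSID".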
Could this be exposed as a special control group? Or could IO be made
a special "task" that can be added to regular control groups?
e.g.,
# mkdir /sys/fs/resctrl/some_group
# echo IO >/sys/fs/resctrl/some_group/tasks
This would assign the group's CLOSID to IO (in addition to any tasks
using the CLOSID).
Or, we have some special file:
# echo foo >/sys/fs/resctrl/some_group/io_devices
This would assign the group's CLOSID to the device "foo" (we'd need
some manageable naming scheme, preferably that maps in a sane way to
sysfs).
I'm not trying to rock the boat here, but for MPAM we're anticipating
the need to be able to control the CLOSID used by devices that are
behind an IOMMU. (Arm's SMMU allows a PARTID to be configured for each
device I/O context behind the SMMU.)
This is desirable for assigning devices to VMs, so that their traffic
can be managed alongside the VM.
Do you think SDCIAE could fit in with this kind of scheme?
Everything you mentioned can already be done today without SDCIAE.
It does not need SDCIAE or a similar feature.
You can consolidate all the IO tasks into one group and assign limits to them or monitor them. However, resctrl has no knowledge of IO devices (or their names). It only knows about tasks.
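A minimal sketch of that consolidation, using only the existing per-group tasks and schemata files. The function names, the group name, and the RESCTRL_ROOT override are assumptions for illustration, and the bitmask in the usage note is an arbitrary example value.

```shell
#!/bin/sh
# Hypothetical helpers (not part of resctrl) for grouping IO-related tasks.
# RESCTRL_ROOT is an assumption for this sketch; it defaults to the
# usual mount point.

# Create the control group if needed and move each given PID into it.
move_tasks_to_group() {
    group="$1"
    shift
    root="${RESCTRL_ROOT:-/sys/fs/resctrl}"

    mkdir -p "$root/$group" || return 1
    for pid in "$@"; do
        # One task ID per write to the group's "tasks" file.
        echo "$pid" > "$root/$group/tasks" || return 1
    done
}

# Write one schemata line (for example an L3 capacity bitmask) for the group.
set_group_schemata() {
    group="$1"
    line="$2"
    root="${RESCTRL_ROOT:-/sys/fs/resctrl}"

    echo "$line" > "$root/$group/schemata"
}
```

For example, `move_tasks_to_group io_group 1234 5678` followed by `set_group_schemata io_group "L3:0=00ff"` would limit those two tasks on cache domain 0. As noted above, this only covers tasks; resctrl has no handle for naming a device.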
[...]
diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index 108995640ca5..89e856e6892c 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -152,6 +152,29 @@ related to allocation:
"not supported":
Support not available for this resource.
+ The feature can be modified by writing to the interface, for example:
+
+ To enable:
+ # echo 1 > /sys/fs/resctrl/info/L3/io_alloc
+
+ To disable:
+ # echo 0 > /sys/fs/resctrl/info/L3/io_alloc
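The toggle documented in the hunk above could be wrapped in a small guard that reads the file first and refuses to write when it reports "not supported". This is only a sketch: the function name set_io_alloc and the RESCTRL_ROOT override are assumptions, and it assumes that any value other than "not supported" means the file is writable.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of the patch): enable or disable io_alloc
# for L3, but only if the interface exists and reports that it is supported.
# RESCTRL_ROOT is an assumption for this sketch; it defaults to the
# usual mount point.
set_io_alloc() {
    want="$1"    # 1 to enable, 0 to disable
    root="${RESCTRL_ROOT:-/sys/fs/resctrl}"
    file="$root/info/L3/io_alloc"

    if [ ! -e "$file" ]; then
        echo "io_alloc interface not present" >&2
        return 1
    fi

    state=$(cat "$file") || return 1
    if [ "$state" = "not supported" ]; then
        echo "io_alloc not supported for L3" >&2
        return 1
    fi

    # Same write as in the documented example: echo 1 or 0 into io_alloc.
    echo "$want" > "$file"
}
```

Usage would be `set_io_alloc 1` to enable and `set_io_alloc 0` to disable.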
"info" is mostly read-only, though it does seem a reasonable place for
per-resource global controls. Today, there is already
"max_threshold_occupancy".
Agree.
It doesn't feel worth trying to introduce a new hierarchy for this kind
of thing, but the name "info" does not suggest that there are writable
controls here.
To make it official, does it make sense to add something like:
--8<--
diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index fbbcf5421346..0cc9edf8d357 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -72,6 +72,10 @@ The 'info' directory contains information about the enabled
resources. Each resource has its own subdirectory. The subdirectory
names reflect the resource names.
+Most of the files in the resource's subdirectory are read-only, and describe
+properties of the resource. Resources that have global configuration options
+provide writable files here that can be used to control them.
+
Yes. It is reasonable. We can add it. How about this?
"Most of the files in the resource's subdirectory are read-only, and describe properties of the resource. Resources that support global configuration options also include writable files that can be used to modify those settings."
Thanks
Babu