[PATCH v3 0/8] riscv: Add Ssqosid and initial CBQRI resctrl support

From: Drew Fustini

Date: Sun Jun 28 2026 - 17:18:52 EST


This series adds initial RISC-V QoS support: the Ssqosid extension [1]
(srmcfg CSR), the CBQRI controller interface [2] integrated with resctrl
[3], and a DT-based platform driver for cache controllers. It has been
tested on both the Tenstorrent Ascalon Shared Cache controller and a QEMU
implementation [4].

The QEMU implementation can be booted with a CBQRI capacity controller
as follows:

  qemu-system-riscv64 -M virt,aia=aplic-imsic -nographic -m 1G -smp 8 \
    -kernel arch/riscv/boot/Image \
    -append "root=/dev/vda ro console=ttyS0 rootwait" \
    -drive if=none,file=rootfs.ext2,format=raw,id=hd0 \
    -device virtio-blk-device,drive=hd0 \
    -device riscv.cbqri.capacity,max_mcids=256,max_rcids=64,ncblks=16,mmio_base=0x04820000
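
For readers new to Ssqosid, here is a rough user-space sketch of how an
RCID/MCID pair could be packed into an srmcfg value. It is not code from
this series. The field layout (RCID in bits 11:0, MCID in bits 27:16) is
my reading of the spec [1] and should be checked against it. The helper
name srmcfg_encode() and the MAX_* limits are hypothetical, with the
limits simply mirroring max_rcids and max_mcids from the QEMU command
line above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed srmcfg layout (verify against the Ssqosid spec). */
#define SRMCFG_RCID_MASK	0x00000fffUL
#define SRMCFG_MCID_SHIFT	16
#define SRMCFG_MCID_MASK	(0xfffUL << SRMCFG_MCID_SHIFT)

/* Hypothetical limits, mirroring the QEMU device properties. */
#define MAX_RCIDS	64
#define MAX_MCIDS	256

/*
 * Hypothetical helper: pack an (rcid, mcid) pair into an srmcfg value.
 * Returns false and leaves *out untouched if either ID is out of range.
 */
static bool srmcfg_encode(uint32_t rcid, uint32_t mcid, unsigned long *out)
{
	if (rcid >= MAX_RCIDS || mcid >= MAX_MCIDS)
		return false;

	*out = ((unsigned long)rcid & SRMCFG_RCID_MASK) |
	       (((unsigned long)mcid << SRMCFG_MCID_SHIFT) & SRMCFG_MCID_MASK);
	return true;
}
```

With these assumptions, RCID 5 and MCID 3 encode to 0x00030005, and an
RCID of 64 is rejected.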

Cache allocation can be exercised on the booted system. Mount resctrl
and read the default schemata. The L2 controller has 16 capacity
blocks, so the default capacity bitmask (CBM) is 0xffff:

  # mount -t resctrl resctrl /sys/fs/resctrl
  # cat /sys/fs/resctrl/schemata
  L2:0=ffff
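
The relationship between the block count and the default CBM is simple
arithmetic: one bit per capacity block. A minimal sketch, where
cbm_full_mask() is a hypothetical name and not a function from this
series:

```c
#include <stdint.h>

/*
 * Hypothetical helper: return a mask with the low ncblks bits set,
 * i.e. "every capacity block allowed". The shift is guarded so that
 * ncblks >= 64 does not invoke undefined behaviour.
 */
static uint64_t cbm_full_mask(unsigned int ncblks)
{
	if (ncblks == 0)
		return 0;
	if (ncblks >= 64)
		return UINT64_MAX;
	return (UINT64_C(1) << ncblks) - 1;
}
```

For ncblks=16 this yields 0xffff, matching the schemata output above.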

Write a narrower CBM to a new control group and read it back to confirm
the L2 controller applied it:

  # mkdir /sys/fs/resctrl/group0
  # echo "L2:0=ff" > /sys/fs/resctrl/group0/schemata
  # cat /sys/fs/resctrl/group0/schemata
  L2:0=ff
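
A test script could check the read-back along these lines. This is only
a user-space sketch that handles a single "RES:DOMAIN=MASK" entry such
as the one above. The struct and function names are hypothetical, and
real schemata lines may carry several domains separated by ';', which
this does not handle.

```c
#include <stdint.h>
#include <stdio.h>

/* One parsed schemata entry, e.g. "L2:0=ff". */
struct schemata_entry {
	char res[8];		/* resource name, e.g. "L2" */
	unsigned int dom;	/* domain id */
	uint32_t cbm;		/* capacity bitmask (hex in the file) */
};

/* Parse a single-domain schemata line. Returns 0 on success, -1 on error. */
static int parse_schemata_line(const char *line, struct schemata_entry *e)
{
	unsigned int cbm;

	if (sscanf(line, "%7[^:]:%u=%x", e->res, &e->dom, &cbm) != 3)
		return -1;
	e->cbm = cbm;
	return 0;
}

/* True if cbm is non-empty and only uses the low ncblks bits. */
static int cbm_fits(uint32_t cbm, unsigned int ncblks)
{
	if (cbm == 0)
		return 0;
	if (ncblks >= 32)
		return 1;
	return (cbm >> ncblks) == 0;
}
```

Parsing "L2:0=ff" gives resource "L2", domain 0 and mask 0xff, which
fits within the 16 blocks of the L2 controller used here.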

Note that this series only implements support for resctrl L2 and L3
cache resources using CBQRI capacity allocation control. cc_block_mask
maps onto resctrl's existing CBM schema. However, cc_cunits is not
supported because the resctrl schemata has no existing equivalent for
capacity units.

I had previously been iterating on an RFC series [5] that fully
implemented CBQRI, including capacity monitoring as well as bandwidth
allocation and monitoring. The CBQRI bandwidth controls do not fit well
into resctrl's existing throttle-based MB schemata. I believe the path
forward is Reinette's generic schema description proof of concept [6],
and I plan to rebase full CBQRI support onto the generic schema once it
is ready.

This series is based on the linux-next tag next-20260623.

[1] https://github.com/riscv/riscv-ssqosid/releases/tag/v1.0
[2] https://github.com/riscv-non-isa/riscv-cbqri/releases/tag/v1.0
[3] https://docs.kernel.org/filesystems/resctrl.html
[4] https://github.com/tt-fustini/qemu/tree/riscv-cbqri-cache
[5] https://lore.kernel.org/linux-riscv/20260601-ssqosid-cbqri-rqsc-v7-0-v6-16-baf00f50028a@xxxxxxxxxx/
[6] https://lore.kernel.org/all/aab804b9-e8b5-40ad-a85b-af7033391243@xxxxxxxxx/

Changes in v3:
--------------
- riscv,cbqri.yaml:
  - Require a device-specific compatible so a bare generic compatible
    is no longer valid on its own.
  - Rename node name from cache-controller@ to qos-controller@
  - Rename compatible from tenstorrent,ascalon-sc-cbqri to
    tenstorrent,ascalon-shared-cache-controller

- Take cbqri_controllers_lock in cbqri_attach_cpu_to_all_ctrls() so a
controller probed after boot can't corrupt the cbqri_controllers list.

- Revise comment above riscv_srmcfg_reset_cache() to clarify that the
  teardown callback is not relied on. The cpuhp startup callback re-arms
  the per-cpu sentinel, which forces the CSR write on a re-onlined CPU.

- Drop the memory fences around the srmcfg CSR write on context switch.
Ssqosid does not require the ordering. The brief tagging inaccuracy at
the switch boundary is acceptable for QoS.

- Access the CBQRI controller registers with 32-bit reads and writes.
The spec only guarantees single-copy atomicity for 4-byte accesses.
This also removes the dependency on native 64-bit MMIO.

- Program cc_cunits to 0 before a config limit operation on controllers
that support capacity units, so a stale unit limit does not constrain
block-mask allocation.

- Link to v2:
https://patch.msgid.link/20260624-dfustini-atl-sc-cbqri-dt-v2-0-2f8049fd902b@xxxxxxxxxx

- Sashiko review of v2:
https://sashiko.dev/#/patchset/20260624-dfustini-atl-sc-cbqri-dt-v2-0-2f8049fd902b@xxxxxxxxxx
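
To illustrate the 32-bit register access item in the v3 list above, here
is a user-space sketch of reading and writing a 64-bit register as two
4-byte accesses. It is not the code from this series. The helper names
are hypothetical, plain memory stands in for MMIO, and the low-word-
first ordering is an assumption, in the spirit of the kernel's
lo_hi_readq() helpers.

```c
#include <stdint.h>

/* Stand-ins for 32-bit MMIO accessors; "base" is plain memory here. */
static uint32_t reg_read32(const volatile uint32_t *base, unsigned int off)
{
	return base[off / 4];
}

static void reg_write32(volatile uint32_t *base, unsigned int off, uint32_t v)
{
	base[off / 4] = v;
}

/*
 * Read a 64-bit register as two 32-bit accesses, low word first.
 * Each half is read atomically, but the combined value is not.
 */
static uint64_t reg_read64_lo_hi(const volatile uint32_t *base, unsigned int off)
{
	uint64_t lo = reg_read32(base, off);
	uint64_t hi = reg_read32(base, off + 4);

	return lo | (hi << 32);
}

/* Write a 64-bit register as two 32-bit accesses, low word first. */
static void reg_write64_lo_hi(volatile uint32_t *base, unsigned int off, uint64_t v)
{
	reg_write32(base, off, (uint32_t)v);
	reg_write32(base, off + 4, (uint32_t)(v >> 32));
}
```

Because the two halves are separate accesses, a caller that needs a
consistent 64-bit value has to make sure the register cannot change
between them.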

Changes in v2:
--------------
The changes in this revision address the Sashiko review of v1.

- Restore the srmcfg CSR for the current task on CPU_PM_EXIT and
CPU_PM_ENTER_FAILED, so it is not left configured incorrectly until
the next context switch.

- Serialize the cbqri_controllers list insert and the boot time walk
with a mutex, so an asynchronous driver probe cannot corrupt the list.

- Skip a controller at an unsupported cache level instead of aborting
resctrl setup, so valid L2 and L3 controllers still register.

- RISCV_ISA_SSQOSID selects ARCH_HAS_CPU_RESCTRL and RISCV_CBQRI
together, so no intermediate commit enables RESCTRL_FS without the
CBQRI resctrl glue.

- Rename RISCV_CBQRI_DRIVER to RISCV_CBQRI, since it builds the
  CBQRI core ops and resctrl integration rather than a driver.

- Drop the RISCV_CBQRI_DRIVER_DEBUG Kconfig option and rely on dynamic
debug to control the pr_debug() output.

- Note: Sashiko flagged the lack of suspend/resume state restore. I do
  not plan to fix that, because register state is only lost when the
  power domain is gated, which offlines the harts sharing the cache.
  resctrl reprograms the default capacity mask through the normal
  control domain online path on resume.

- Link to v1:
https://lore.kernel.org/all/20260619-dfustini-atl-sc-cbqri-dt-v1-0-e79a7723fab0@xxxxxxxxxx/

- Sashiko review of v1:
https://sashiko.dev/#/patchset/20260619-dfustini-atl-sc-cbqri-dt-v1-0-e79a7723fab0@xxxxxxxxxx
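
The "skip instead of abort" behaviour from the v2 list above can be
sketched as follows. This is a simplified user-space model, not the
code from this series: struct ctrl, level_supported() and
register_ctrls() are hypothetical names, and the only fact taken from
the text is that L2 and L3 are the supported cache levels.

```c
#include <stddef.h>

/* Minimal stand-in for a probed controller. */
struct ctrl {
	int cache_level;
};

/* Only L2 and L3 are handled by this series. */
static int level_supported(int level)
{
	return level == 2 || level == 3;
}

/*
 * Collect the supported controllers into out[] and return how many
 * were kept. An unsupported level is skipped rather than treated as
 * a fatal error, so later valid controllers still register.
 */
static size_t register_ctrls(const struct ctrl *ctrls, size_t n,
			     const struct ctrl **out)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		if (!level_supported(ctrls[i].cache_level))
			continue;	/* skip, do not abort setup */
		out[kept++] = &ctrls[i];
	}
	return kept;
}
```

Given controllers at levels 2, 1, 3 and 4, the L2 and L3 entries are
kept and the other two are ignored.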

---
Drew Fustini (8):
  dt-bindings: riscv: Add Ssqosid extension description
  riscv: Detect the Ssqosid extension
  riscv: Add support for srmcfg CSR from Ssqosid extension
  riscv_cbqri: Add capacity controller probe and allocation device ops
  riscv_cbqri: resctrl: Add cache allocation via capacity block mask
  riscv: Enable resctrl filesystem for Ssqosid
  dt-bindings: riscv: Add binding for CBQRI controllers
  riscv_cbqri: Add CBQRI capacity allocation platform driver

.../devicetree/bindings/riscv/extensions.yaml | 6 +
.../devicetree/bindings/riscv/riscv,cbqri.yaml | 97 +++
MAINTAINERS | 15 +
arch/riscv/Kconfig | 20 +
arch/riscv/include/asm/csr.h | 5 +
arch/riscv/include/asm/hwcap.h | 1 +
arch/riscv/include/asm/processor.h | 3 +
arch/riscv/include/asm/qos.h | 74 ++
arch/riscv/include/asm/resctrl.h | 147 ++++
arch/riscv/include/asm/switch_to.h | 3 +
arch/riscv/kernel/Makefile | 2 +
arch/riscv/kernel/cpufeature.c | 1 +
arch/riscv/kernel/qos.c | 99 +++
drivers/resctrl/Kconfig | 29 +
drivers/resctrl/Makefile | 5 +
drivers/resctrl/cbqri_capacity.c | 132 ++++
drivers/resctrl/cbqri_devices.c | 562 +++++++++++++++
drivers/resctrl/cbqri_internal.h | 124 ++++
drivers/resctrl/cbqri_resctrl.c | 787 +++++++++++++++++++++
include/linux/riscv_cbqri.h | 47 ++
20 files changed, 2159 insertions(+)
---
base-commit: 4e5dfb7c84012007c3c7061126491bbc92d71bf1
change-id: 20260610-dfustini-atl-sc-cbqri-dt-410c8e2711dd

Best regards,
--
Drew Fustini <fustini@xxxxxxxxxx>