Re: [PATCH v3 2/3] selftests/resctrl: Implement cl_flush() and sb() for aarch64

From: Ben Horgan

Date: Fri Jun 26 2026 - 09:59:10 EST


Hi Reinette, Richard,

On 6/26/26 04:26, Reinette Chatre wrote:
> +Ben
>
> Hi Richard,
>
> On 5/28/26 7:23 PM, Richard Cheng wrote:
>> cl_flush() and sb() in fill_buf.c only have implementations for i386
>> and x86_64, so on aarch64 both compile to empty functions. mem_flush()
>> then walks the buffer calling a no-op cl_flush() per cache line and
>> finishes with a no-op sb(), leaving any caller that expects a flushed
>> buffer (e.g. CMT, L3_CAT) operating on unflushed state with no warning.
>>
>> Add an aarch64 code block using the ARM equivalents:
>> * "dc civac, %0" for cl_flush()
>> * "dsb sy" for sb()
>
> Calling on Arm experts here since my superficial check found sfence to
> be used for __wmb() on x86 and the Arm equivalent per
> arch/arm64/include/asm/barrier.h appears to be "dsb st"?

Referring to the Arm Architecture Reference Manual (DDI0487 version
M.a.a):
D7.5.9.15 Ordering and completion of data and instruction cache
instructions
This section describes using a DSB for the synchronization and also
states:
"In all cases, where the text in this section refers to a DMB or a DSB,
this means a DMB or DSB whose required access type is both loads and
stores."

Hence, in this case a "dsb st" is insufficient as its required access
type is stores only, not both loads and stores. A full "dsb sy" would
work to synchronize the "dc civac".
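
For illustration only, the ordering point above could be sketched as
below. This is a minimal sketch and not the patch itself: flush_range()
is a hypothetical helper, CL_SIZE of 64 is an assumed line size that is
not read from the CPU, the buffer is assumed to be line-aligned, and on
non-aarch64 builds both helpers compile to no-ops.

```c
#include <stddef.h>
#include <stdlib.h>

#define CL_SIZE 64 /* assumed cache line size, not queried from the CPU */

/* Clean and invalidate one line by VA to the PoC; no-op elsewhere. */
static void cl_flush(void *p)
{
#if defined(__aarch64__)
	asm volatile("dc civac, %0" : : "r"(p) : "memory");
#else
	(void)p;
#endif
}

/* Barrier whose required access type is both loads and stores. */
static void sb(void)
{
#if defined(__aarch64__)
	asm volatile("dsb sy" : : : "memory");
#endif
}

/*
 * Hypothetical helper: issue one DC op per line, then a single barrier
 * so that all of the preceding DC ops have completed on return.
 * Returns the number of lines visited.
 */
static size_t flush_range(void *buf, size_t len)
{
	unsigned char *cp = buf;
	size_t i, lines = 0;

	for (i = 0; i < len; i += CL_SIZE) {
		cl_flush(&cp[i]);
		lines++;
	}
	sb();
	return lines;
}
```

The single barrier after the loop is the part that matters here: it is
what would have to be "dsb sy" rather than "dsb st".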

However, I don't think "dc civac" does what is expected of cl_flush().

>
> Even so, it looks like the changes below were considered by Ben during
> a previous submission but I am not able to tell if his feedback was taken
> into account here.
> Please see:
> https://lore.kernel.org/lkml/4449219d-186f-4b5e-9f72-958d1af80f58@xxxxxxx/
> https://lore.kernel.org/lkml/5aea412c-9c7f-4692-af3e-79937956ef8c@xxxxxxx/

My understanding is that the resctrl selftests want to use cl_flush()
to invalidate entries in a system level cache in order to test the
cache portion bitmaps. However, the mechanism to invalidate the system
level cache is generally implementation defined.

>
>>
>> Both instructions are EL0-accessible on Linux aarch64.
>>
>> Signed-off-by: Richard Cheng <icheng@xxxxxxxxxx>
>> ---
>> tools/testing/selftests/resctrl/fill_buf.c | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c
>> index 19a01a52dc1a..a41d21e5a64e 100644
>> --- a/tools/testing/selftests/resctrl/fill_buf.c
>> +++ b/tools/testing/selftests/resctrl/fill_buf.c
>> @@ -27,6 +27,9 @@ static void sb(void)
>> #if defined(__i386) || defined(__x86_64)
>> asm volatile("sfence\n\t"
>> : : : "memory");
>> +#elif defined(__aarch64__)
>> + asm volatile("dsb sy\n\t"
>> + : : : "memory");
>> #endif
>> }
>>
>> @@ -35,6 +38,9 @@ static void cl_flush(void *p)
>> #if defined(__i386) || defined(__x86_64)
>> asm volatile("clflush (%0)\n\t"
>> : : "r"(p) : "memory");
>> +#elif defined(__aarch64__)
>> + asm volatile("dc civac, %0\n\t"
>> + : : "r"(p) : "memory");


This will only clean and invalidate to the Point of Coherency (PoC).

To quote the Arm Architecture Reference Manual (DDI0487 version M.a.a):
D7.5.9.2 The data cache maintenance instruction (DC)

If there are caches after the Point of Coherency and FEAT_PoPS is not
implemented, then the DC CIVAC and DC CIGDVAC instructions are not
sufficient to remove all copies of a poisoned Location and it is
IMPLEMENTATION DEFINED whether any IMPLEMENTATION DEFINED mechanism
exists to remove poison from a Location.

In most systems the system level cache (SLC), where your MPAM cache
portions are likely to be, will be past the PoC, and I wouldn't expect
FEAT_PoPS to be implemented.

Thanks,

Ben


>> #endif
>> }
>>
>
> Reinette