Re: [PATCH v2 07/12] rv: Fix monitor start ordering and memory ordering for monitoring flag

From: Wen Yang

Date: Sun May 31 2026 - 10:55:08 EST




On 5/28/26 17:09, Nam Cao wrote:
Gabriele Monaco <gmonaco@xxxxxxxxxx> writes:
From: Wen Yang <wen.yang@xxxxxxxxx>

da_monitor_start() sets monitoring=1 before calling da_monitor_init_hook(),
which can race with the sched_switch handler:

  da_monitor_start()                 sched_switch handler
  -------------------------          ---------------------------------
  da_mon->monitoring = 1;
                                     if (da_monitoring(da_mon)) /* true */
                                       ha_start_timer_ns(...);
                                       /* hrtimer->base == NULL, crash */
  da_monitor_init_hook(da_mon);
    /* hrtimer_setup() sets base */

Fix the ordering and pair with release/acquire semantics:

  da_monitor_init_hook(da_mon);
  smp_store_release(&da_mon->monitoring, 1);      /* da_monitor_start() */
  return smp_load_acquire(&da_mon->monitoring);   /* da_monitoring() */

Reordering the two statements alone is not sufficient on weakly ordered
architectures: on ARM64 a plain STR + LDR does not form a release-acquire
pair, so the load can still observe monitoring=1 while hrtimer->base is
NULL. The plain accesses are also data races under KCSAN.

Use WRITE_ONCE for the monitoring=0 store in da_monitor_reset() to
cover the reset path.

Fixes: 792575348ff7 ("rv/include: Add deterministic automata monitor definition via C macros")
Signed-off-by: Wen Yang <wen.yang@xxxxxxxxx>
Reviewed-by: Gabriele Monaco <gmonaco@xxxxxxxxxx>
Signed-off-by: Gabriele Monaco <gmonaco@xxxxxxxxxx>

Looks correct to me.
Reviewed-by: Nam Cao <namcao@xxxxxxxxxxxxx>

Wen, I am curious, how did you find this issue?


It was caught during development of the tlob monitor by a KUnit test
(since moved to selftests) that monitored a kthread running concurrently
on another CPU. On a multi-core box the window between setting
monitoring=1 and the call to da_monitor_init_hook() was wide enough to
trigger the crash, and KCSAN also flagged the plain monitoring accesses
as data races.
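
For anyone following along, the publish pattern the patch relies on can be
sketched in userspace with C11 atomics. This is only an illustration, not
the kernel code: every fake_* name below is a hypothetical stand-in, with
memory_order_release/acquire playing the role of smp_store_release() and
smp_load_acquire(), and a relaxed store standing in (roughly) for
WRITE_ONCE().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real rv types. */
struct fake_timer {
	void *base;		/* models hrtimer->base, NULL until set up */
};

struct fake_monitor {
	struct fake_timer timer;
	atomic_int monitoring;	/* models da_mon->monitoring */
};

static int fake_clock_base;	/* any non-NULL address will do */

/* Models da_monitor_init_hook(): must finish before the flag is published. */
static void fake_init_hook(struct fake_monitor *m)
{
	m->timer.base = &fake_clock_base;
}

/* Models the fixed da_monitor_start(): initialize first, then publish. */
static void fake_monitor_start(struct fake_monitor *m)
{
	fake_init_hook(m);
	/* Release: the store to timer.base is ordered before this store. */
	atomic_store_explicit(&m->monitoring, 1, memory_order_release);
}

/* Models da_monitoring(): the acquire load pairs with the release above. */
static bool fake_monitoring(struct fake_monitor *m)
{
	return atomic_load_explicit(&m->monitoring, memory_order_acquire) != 0;
}

/* Models da_monitor_reset(): a relaxed store, roughly like WRITE_ONCE(). */
static void fake_monitor_reset(struct fake_monitor *m)
{
	atomic_store_explicit(&m->monitoring, 0, memory_order_relaxed);
}

/*
 * Models the sched_switch handler. Returns false if monitoring is off,
 * otherwise reports whether the timer base was initialized when used.
 */
static bool fake_handler(struct fake_monitor *m)
{
	if (!fake_monitoring(m))
		return false;
	/* Observing monitoring == 1 implies the timer.base store is visible. */
	return m->timer.base != NULL;
}
```

In a real test fake_monitor_start() and fake_handler() would run on
different threads; with the release/acquire pair a handler that sees the
flag set can never see a NULL base, which is the property the original
ordering did not provide.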

--
Best wishes,
Wen