Re: [PATCH v3 8/9] x86/resctrl: Ensure domain fully initialized before placed on RCU list
From: Moger, Babu
Date: Thu May 28 2026 - 19:10:47 EST
Hi Reinette,
On 5/28/2026 3:56 PM, Reinette Chatre wrote:
Hi Babu,
On 5/28/26 12:04 PM, Babu Moger wrote:
On 5/28/26 11:11, Reinette Chatre wrote:
On 5/22/26 12:15 PM, Reinette Chatre wrote:
static void l3_mon_domain_setup(int cpu, int id, struct rdt_resource *r, struct list_head *add_pos)
@@ -556,14 +554,12 @@ static void l3_mon_domain_setup(int cpu, int id, struct rdt_resource *r, struct
return;
}
- list_add_tail_rcu(&d->hdr.list, add_pos);
-
err = resctrl_online_mon_domain(r, &d->hdr);
if (err) {
- list_del_rcu(&d->hdr.list);
- synchronize_rcu();
l3_mon_domain_free(hw_dom);
+ return;
}
+ list_add_tail_rcu(&d->hdr.list, add_pos);
}
static void domain_add_cpu_mon(int cpu, struct rdt_resource *r)
I resubmitted the last three patches of series to obtain Sashiko review [1] and
respond to that feedback here:
a) Sashiko: "Does this reordering expose the monitor directories to userspace before the
domain is actually added to the RCU list?"
Yes. As pointed out by Sashiko there is a short time where the monitoring data files
may be exposed to user space before the monitoring domain is added to the RCU list.
Also pointed out by Sashiko, if user attempts to read from such file it will return
-ENOENT.
This behavior looks correct and acceptable to me.
Yes. I agree,
Thank you for taking a look.
b) Sashiko: "This is a pre-existing issue, but does resctrl_find_domain() safely traverse
the RCU list?
Yes, this is safe because resctrl_find_domain() is run with cpus_read_lock() that ensures
the list can be traversed safely because the list can only be modified with
CPU write lock held.
One improvement to resctrl that would help support this is to replace the "list_for_each()"
domain list traversals with something like:
list_for_each_entry_rcu(pos, head, member, lockdep_is_cpus_held())
Doing something like above would help document why such list traversal outside of
RCU read-side critical section is safe.
Are you planning to make this change at this time? It appears to be a corner case, and we haven’t observed it in real-world scenarios.
I am considering it, yes, and was planning to make the change first to determine the impact on this series
before deciding. This is not a functional change but instead making the code conform to best practice that
implicitly documents why it is safe to traverse the list outside of an RCU read-side critical section. I
think this would be a nice addition to resctrl. Including it in this series may actually help reviewers consider
all the flows involved in these races. Do you think this should be deferred? resctrl cleanup work is not
popular [2] ...
If it’s as simple as just using the new call (list_for_each_entry_rcu()), then that should be fine.
I think this is a good candidate for deferral.
Sashiko appears to be reporting every possible scenario, so we may need to decide what should be prioritized in such cases. Finally, it is your(maintainers) call.
Thanks,
Babu