It's an optimization question: which is rarer, force_quiescent_state() or "normal" cpu_quiet calls?
+ */
I'm not sure if a bitmap is the right storage. If I understand the code correctly, it contains two pieces of information:
+struct rcu_node {
+	spinlock_t lock;
+	unsigned long qsmask;	/* CPUs or groups that need to switch in */
+				/*  order for current grace period to proceed.*/
+	unsigned long qsmaskinit;
+				/* Per-GP initialization for qsmask. */
1) If the bitmap is clear, then all cpus have completed whatever they need to do.
A counter would be more efficient than a bitmap for this. In particular, it would allow choosing the optimal fan-out independently of the 32/64-bit word size.
2) Whether the current cpu must do something to complete the current period.
This is local information: usually (always?) only the current cpu needs to know whether it must do something.
So it doesn't need to be stored in a shared structure; it could be stored in a per-cpu structure.
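To make the suggestion concrete, here is a minimal user-space sketch of what I have in mind, not code from the patch. All names (node_sketch, cpu_sketch, start_gp, cpu_quiet_sketch) are made up for illustration, and locking is left out entirely; real code would have to take each node's lock, or use atomics, around the decrement.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node: a counter stands in for the qsmask bitmap (point 1). */
struct node_sketch {
	unsigned int pending;		/* children that have not reported yet */
	struct node_sketch *parent;	/* NULL at the root */
};

/* Hypothetical per-cpu state: the "must I do something?" flag (point 2). */
struct cpu_sketch {
	bool qs_pending;		/* this cpu still owes a quiescent state */
	struct node_sketch *leaf;	/* leaf node this cpu reports to */
};

/* Per-grace-period initialization of one node. */
static void start_gp(struct node_sketch *np, unsigned int nchildren)
{
	np->pending = nchildren;
}

/*
 * Report a quiescent state for one cpu and propagate upwards while
 * nodes drain. Returns true only if this report emptied the root,
 * i.e. completed the grace period. Locking is deliberately omitted.
 */
static bool cpu_quiet_sketch(struct cpu_sketch *cp)
{
	struct node_sketch *np;

	if (!cp->qs_pending)
		return false;		/* already reported for this period */
	cp->qs_pending = false;

	for (np = cp->leaf; np != NULL; np = np->parent) {
		if (--np->pending != 0)
			return false;	/* siblings are still outstanding */
	}
	return true;
}
```

Since the counter has no word-size limit, the fan-out per node can be anything. The obvious cost is that a node no longer records which children are outstanding, so anything that needs that list has to get it from elsewhere.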
I am using the bitmap in force_quiescent_state() to work out which CPUs
need their dynticks state checked and which need a reschedule IPI. I
could scan all of the per-CPU rcu_data structures, but am assuming that
after a few jiffies there would typically be relatively few CPUs still
needing to pass through a quiescent state. Given this assumption, on
systems with large numbers of CPUs, scanning the bitmask greatly reduces
the number of cache misses compared to scanning the rcu_data structures.
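As a rough illustration of the scan described above (a user-space sketch, not the code from the patch): one word of the mask is read, and only the CPUs whose bits are still set are visited. The function name pending_cpus() and its parameters are made up for this example.

```c
#include <stddef.h>

/*
 * Hypothetical helper: write the CPU numbers of all set bits in 'mask'
 * to out[] (at most 'max' of them) and return how many were written.
 * 'base' is the number of the CPU that bit 0 of this mask stands for.
 */
static size_t pending_cpus(unsigned long mask, unsigned int base,
			   unsigned int *out, size_t max)
{
	size_t n = 0;
	unsigned int bit;

	/* The loop ends as soon as no set bits remain above 'bit'. */
	for (bit = 0; mask != 0 && n < max; bit++, mask >>= 1) {
		if (mask & 1UL)
			out[n++] = base + bit;
	}
	return n;
}
```

With only a few bits left set, the caller touches the per-CPU structures of just those few CPUs, which is where the saving in cache misses would come from.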