The rcu_barrier() family of functions registers a callback on each CPU,
and waits until all these callbacks have been invoked. The CPU offlining
process preserves the order of the callbacks that were registered on a
given CPU. Thus, when rcu_barrier() returns, all RCU callbacks previously
registered are guaranteed to have already been invoked, regardless of
which CPUs might have been offlined and onlined in the meantime.
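
The ordering argument above can be illustrated with a small user-space
toy model. This is only a sketch, not kernel code: the per-CPU queues,
the single-threaded "invocation" loop, and every name in it
(toy_barrier(), offline_cpu(), run_demo(), and so on) are hypothetical
and chosen for illustration. It assumes each CPU's callbacks form a FIFO
queue and that offlining a CPU appends its queue, in order, to another
CPU's queue.

```c
#include <assert.h>

/* Toy single-threaded model of the argument; NOT kernel code.
 * All names and data structures here are hypothetical. */
#define NCPUS 3
#define QCAP  16

typedef void (*cb_fn)(void);

struct cpu_queue {
	cb_fn cbs[QCAP];
	int head, tail;		/* FIFO: enqueue at tail, invoke from head */
};

static struct cpu_queue queues[NCPUS];
static int normal_invoked;	/* ordinary callbacks invoked so far */
static int barrier_pending;	/* barrier callbacks not yet invoked */

static void normal_cb(void)  { normal_invoked++; }
static void barrier_cb(void) { barrier_pending--; }

static void enqueue(int cpu, cb_fn fn)
{
	struct cpu_queue *q = &queues[cpu];

	assert(q->tail < QCAP);
	q->cbs[q->tail++] = fn;
}

/* Model offlining: move src's callbacks, in order, to the end of dst's queue. */
static void offline_cpu(int src, int dst)
{
	struct cpu_queue *q = &queues[src];

	while (q->head < q->tail)
		enqueue(dst, q->cbs[q->head++]);
}

/* Invoke the callback at the head of cpu's queue; return 0 if the queue is empty. */
static int invoke_one(int cpu)
{
	struct cpu_queue *q = &queues[cpu];

	if (q->head == q->tail)
		return 0;
	q->cbs[q->head++]();
	return 1;
}

/* Model of the barrier: queue one barrier callback per CPU, then run
 * callbacks until every barrier callback has been invoked.  Returns the
 * number of ordinary callbacks that had been invoked by that point. */
static int toy_barrier(void)
{
	int cpu;

	barrier_pending = NCPUS;
	for (cpu = 0; cpu < NCPUS; cpu++)
		enqueue(cpu, barrier_cb);

	offline_cpu(1, 0);	/* CPU 1 goes offline while the barrier waits */

	while (barrier_pending > 0)
		for (cpu = 0; cpu < NCPUS; cpu++)
			invoke_one(cpu);

	return normal_invoked;
}

/* Register two ordinary callbacks per CPU, then run the barrier. */
static int run_demo(void)
{
	int cpu;

	for (cpu = 0; cpu < NCPUS; cpu++) {
		enqueue(cpu, normal_cb);
		enqueue(cpu, normal_cb);
	}
	return toy_barrier();
}
```

In this model, calling run_demo() from a main() function returns 6: all
six ordinary callbacks have run by the time the last barrier callback is
invoked, even though CPU 1 was offlined in the middle. The result depends
only on the two assumed properties, FIFO invocation and order-preserving
migration, which mirror the guarantees described above.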