Re: [RFC] rcu/nocb: Fix possible bugs in rcu_barrier()

From: Joel Fernandes
Date: Mon Sep 19 2022 - 17:01:26 EST

On 9/19/2022 5:34 AM, Frederic Weisbecker wrote:
> On Sun, Sep 18, 2022 at 10:12:31PM +0000, Joel Fernandes (Google) wrote:
>> When going through the lazy-rcu work, I noticed that
>> rcu_barrier_entrain() does not really wake up the rcuog GP thread in any
>> path after entraining. This means it is possible the GP thread is not
>> awakened soon (say there were no CBs in the cblist after entraining
>> time).
>
> Right.
>
>>
>> Further, nothing appears to be calling the rcu_barrier callback
>> directly in the case the ->cblist was empty which means if the IPI gets
>> delayed enough to make the ->cblist empty and it turns out to be the last
>> CPU holding, then nothing completes rcu_state.barrier_completion.
>
> No need for that, if the cblist is empty there is no need for a callback
> to enqueue.
>

Thanks! I was worried about a race in which smp_call_function_single() takes a
long time to deliver the IPI. But I missed that the smp_call_function_single()
call in rcu_barrier() waits synchronously. I wrongly thought that the waiting
was facilitated by barrier_completion, which has a whole different purpose.

- Joel