Re: [PATCH v2 -rcu] srcu: Use rcu_seq_done_exact() for polling API
From: Kent Overstreet
Date: Wed Feb 19 2025 - 16:11:53 EST

On Wed, Feb 19, 2025 at 08:29:47AM -0500, Joel Fernandes wrote:
>
>
> On 2/19/2025 8:22 AM, Paul E. McKenney wrote:
> > On Wed, Feb 19, 2025 at 07:43:08AM -0500, Joel Fernandes wrote:
> >> poll_state_synchronize_srcu() uses rcu_seq_done() unlike
> >> poll_state_synchronize_rcu() which uses rcu_seq_done_exact().
> >>
> >> The rcu_seq_done_exact() makes more sense for polling API, as with
> >> this API, there is a higher chance that there is a significant delay
> >> between the get_state..() and poll_state..() calls since a cookie
> >> can be stored and reused at a later time. During such a delay, if
> >> the gp_seq counter progresses more than ULONG_MAX/2 distance, then
> >> poll_state..() may incorrectly return false for a long time.
> >>
> >> Fix by using the more accurate rcu_seq_done_exact() API which is
> >> exactly what straight RCU's polling does.
> >>
> >> It may make sense, as future work, to add debug code here as well, where
> >> we compare a physical timestamp between get_state..() and poll_state..()
> >> calls and yell if significant time has passed but the grace period has
> >> still not progressed.
> >>
> >> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@xxxxxxx>
> >> Signed-off-by: Joel Fernandes <joelagnelf@xxxxxxxxxx>
> >
> > Reviewed-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> >
> > But we should also run this by Kent Overstreet, given that bcachefs
> > uses this. Should be OK, given that bcachefs uses this API in the same
> > way that it does poll_state_synchronize_rcu(), but still...
>
> Thanks Paul! Adding Kent Overstreet to the email to raise any objections.

It sounds like rcu_seq_done_exact() is indeed what we want - bcachefs uses
this for determining when objects may be reclaimed (as is typical with
RCU), so we don't want objects to be stranded a "significant time past
the grace period".

Is there any additional cost? I'm not seeing rcu_seq_done_exact() in Linus's
tree yet. Minor additional overhead would be totally fine; we use this
from fs/bcachefs/rcu_pending.c, which doesn't call it for each object.
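
For reference, here is a minimal userspace sketch of the wraparound case
the commit message describes. It assumes both helpers are built on the
usual wrap-safe unsigned comparison; the sketch_* names and the
SKETCH_STATE_MASK value are hypothetical stand-ins for illustration, not
the kernel's definitions.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the low-order state bits of a gp_seq value. */
#define SKETCH_STATE_MASK 3UL

/* Wrap-safe "a >= b": true while a is at most ULONG_MAX/2 ahead of b. */
static bool sketch_cmp_ge(unsigned long a, unsigned long b)
{
	return ULONG_MAX / 2 >= a - b;
}

/* Wrap-safe "a < b": true when a - b lands in the upper half of the range. */
static bool sketch_cmp_lt(unsigned long a, unsigned long b)
{
	return ULONG_MAX / 2 < a - b;
}

/*
 * Plain check: the cookie counts as done only while cur is within
 * ULONG_MAX/2 ahead of it. Once cur runs further ahead than that, the
 * cookie looks like it is in the future and this returns false.
 */
static bool sketch_seq_done(unsigned long cur, unsigned long cookie)
{
	return sketch_cmp_ge(cur, cookie);
}

/*
 * Exact-style check: additionally treat the cookie as done when cur
 * appears to be behind it by more than a small window, since a gap that
 * large is assumed to come from wraparound rather than a pending grace
 * period. The window size here is an assumption.
 */
static bool sketch_seq_done_exact(unsigned long cur, unsigned long cookie)
{
	return sketch_cmp_ge(cur, cookie) ||
	       sketch_cmp_lt(cur, cookie - (2 * SKETCH_STATE_MASK + 1));
}
```

In this sketch the exact variant adds one subtraction and one comparison
over the plain one. A cookie that is only a few counts ahead of cur is
still reported as not done by both variants.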