Re: [QUESTION] srcu: Remove the SCAN2 state

From: Paul E. McKenney
Date: Thu Feb 22 2018 - 11:54:10 EST


On Thu, Feb 22, 2018 at 02:05:18PM +0900, Byungchul Park wrote:
> On 2/22/2018 11:11 AM, Paul E. McKenney wrote:
> >On Thu, Feb 22, 2018 at 08:57:27AM +0900, Byungchul Park wrote:
> >>Hello,
> >>
> >>I'm sorry for bothering you, and I seem to be obviously missing
> >>something, but I'm really wondering why we check try_check_zero()
> >>again in the state, SCAN1, for the previous srcu_idx.
> >>
> >>I mean, since we've already checked try_check_zero() in the previous
> >>grace period and gotten 'true' as a return value, all readers who see
> >>the flipped idx via srcu_flip() won't update the srcu_{lock,unlock}_count
> >>for the previous idx until it gets flipped back again.
> >>
> >>Are there any reasons we check try_check_zero() again in the state, SCAN1?
> >>Are there any problems if the following patch is applied?
> >
> >Indeed there are! Removing the second scan exposes us to a nasty race
> >condition where a reader is preempted (or interrupted or whatever) just
>
> Indeed! I missed the cases. It should be as it is.
>
> Thanks a lot for pointing it out.

Heh! Everyone I know, myself included, who has written such an algorithm
has had this bug in their initial version. In one case, the algorithm
was published in a high-end journal and the bug not spotted for more than
a decade. I suppose I could brag about Mathieu's and my offerings having
been corrected before we published, but the fact remains that an earlier
publication of mine gave the aforementioned algorithm from the high-end
journal as an alternative implementation, and I did not spot the bug.
Nor did any of my co-authors. ;-)

Thanx, Paul

> >after fetching its counter. A detailed explanation for an essentially
>
> --
> Thanks,
> Byungchul
>
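
For readers following the thread, the race mentioned above (a reader
preempted just after fetching its index) can be illustrated with a toy,
single-threaded model. This is only a sketch under stated assumptions, not
kernel code: the class ToySrcu and the functions grace_period() and
straggler_scenario() are hypothetical names invented for this illustration,
the per-CPU counters are collapsed into plain integers, and memory ordering
is ignored entirely. readers_drained() is a rough stand-in for what
try_check_zero() checks.

```python
class ToySrcu:
    """Toy model of SRCU's two counter pairs (hypothetical, not kernel code)."""

    def __init__(self):
        self.idx = 0                # stands in for srcu_idx & 1
        self.lock_count = [0, 0]    # stands in for the summed per-CPU lock counts
        self.unlock_count = [0, 0]  # stands in for the summed per-CPU unlock counts

    def fetch_idx(self):
        # First half of a read-side lock: sample the current index.
        return self.idx

    def lock(self, idx):
        # Second half: bump the counter for the index sampled earlier.
        self.lock_count[idx] += 1

    def unlock(self, idx):
        self.unlock_count[idx] += 1

    def readers_drained(self, idx):
        # Rough analogue of try_check_zero(): no reader is accounted on idx.
        return self.lock_count[idx] == self.unlock_count[idx]

    def flip(self):
        self.idx ^= 1


def grace_period(s, scan_inactive_first):
    """Return True if the grace period could complete now without waiting."""
    if scan_inactive_first and not s.readers_drained(s.idx ^ 1):
        return False  # must wait for stragglers on the inactive index
    s.flip()
    # Scan the index that was active before the flip.
    return s.readers_drained(s.idx ^ 1)


def straggler_scenario(scan_inactive_first):
    s = ToySrcu()
    stale = s.fetch_idx()  # reader samples idx 0, then is preempted
    # First grace period: flips to idx 1, and idx 0 looks drained.
    assert grace_period(s, scan_inactive_first)
    s.lock(stale)  # reader resumes and enters its critical section on idx 0
    # The second grace period starts after that critical section began,
    # so a correct implementation has to wait for it.
    return grace_period(s, scan_inactive_first)
```

In this model, straggler_scenario(True) returns False (the second grace
period has to wait for the straggler), while straggler_scenario(False)
returns True (the grace period completes even though the reader is still
inside its critical section), which is the failure the extra scan guards
against.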