real-time preemption and RCU

From: James Huang
Date: Thu Jun 11 2009 - 19:04:43 EST



Hi Paul,
 
        I have read through your 2005 document on real-time preemption and RCU.
It was very interesting, and your approach to the problem (gradual improvement in each new implementation) makes the ideas very clear.
However, I am baffled by the following potential race condition, which seems to exist in implementations 2 through 5.
To keep the case simple, let's use implementation 2 to illustrate:
 
 
CPU0        |<-- delete M1 -->|             ||        |<-- delete M2 -->|    |<-- delete M3 -->|
                                            ||
CPU1    |<------- read M1 ------->|         ||    |<-------------------- read M2 -------------------->|
                                            ||
CPU2                                        ||  time T: execute synchronize_kernel(): rcu_ctrlblk.batch++

Assume the following initial state:

rcu_data[cpu0].batch = 1
rcu_data[cpu1].batch = 1
rcu_data[cpu2].batch = 1
rcu_ctrlblk.batch = 1
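
For concreteness, here is roughly the state I have in mind (a minimal sketch
from memory of implementation 2; the exact names and fields in your document
may differ):

        /* Sketch of the implementation-2 state referenced below; the field
         * names follow this mail's usage rather than the document verbatim. */
        struct rcu_ctrlblk {
                rwlock_t lock;  /* readers read-lock this; synchronize_kernel()
                                 * write-locks it to advance the batch */
                long batch;     /* global grace-period (batch) counter */
        };

        struct rcu_data {
                long batch;                 /* last global batch this CPU saw */
                struct rcu_head *waitlist;  /* callbacks awaiting a grace period */
        };

        struct rcu_ctrlblk rcu_ctrlblk;
        struct rcu_data rcu_data[NR_CPUS];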

The following steps are executed:
(1) cpu1 read-locked rcu_ctrlblk.lock, read M1, and read-unlocked rcu_ctrlblk.lock.
(2) cpu0 deleted M1.
(3) At time T (marked by ||), cpu2 executed synchronize_kernel(): it write-locked rcu_ctrlblk.lock, incremented rcu_ctrlblk.batch to 2, and write-unlocked rcu_ctrlblk.lock.
(4) cpu1 read-locked rcu_ctrlblk.lock, spent a long time in its RCU read-side critical section, read M2, and read-unlocked rcu_ctrlblk.lock.
(5) cpu0 deleted M2.  But when it executed run_rcu(), cpu0 DID NOT see the most up-to-date value of rcu_ctrlblk.batch.
     So cpu0 just inserted M2 into its waitlist; it did not free M1 and did not update rcu_data[cpu0].batch (i.e., it remained equal to 1).  (See the sketch after this list.)
(6) cpu0 deleted M3.  This time cpu0 saw the most up-to-date value of rcu_ctrlblk.batch (2).
     Since rcu_ctrlblk.batch (2) was larger than rcu_data[cpu0].batch (1), cpu0 freed the memory blocks on its waitlist.
     So both M1 and M2 were freed by cpu0.  But if cpu1 was still accessing M2, this is a problem.
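
In code terms, I picture the update-side path in steps (5) and (6) looking
something like this (a hypothetical sketch built on the structures above;
run_rcu() is the name I used above, and free_waitlist()/add_to_waitlist()
are placeholder helpers, not code from your document):

        /* Hypothetical update-side path.  The unlocked read of
         * rcu_ctrlblk.batch is the step that I believe can return
         * a stale value on cpu0. */
        void run_rcu(struct rcu_head *head)
        {
                struct rcu_data *rdp = &rcu_data[smp_processor_id()];
                long snap = rcu_ctrlblk.batch; /* read WITHOUT rcu_ctrlblk.lock */

                if (snap > rdp->batch) {
                        free_waitlist(rdp);    /* step (6): frees M1 AND M2 */
                        rdp->batch = snap;
                }
                add_to_waitlist(rdp, head);    /* step (5): M2 queued while
                                                * snap still holds the stale 1 */
        }

If the read of rcu_ctrlblk.batch in step (5) is stale, M2 lands on the same
waitlist batch as M1, and step (6) frees them together even though cpu1 may
still be inside its read-side critical section reading M2.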

Am I missing something here?  Does the smp_mb() within run_do_my_batch() have anything to do with this issue?


-- James Huang
