Re: [patch 0/9] Fix various __task_cred related invalid RCUassumptions

From: Paul E. McKenney
Date: Wed Dec 09 2009 - 21:28:48 EST


On Thu, Dec 10, 2009 at 12:52:46AM -0000, Thomas Gleixner wrote:
> While auditing the read_lock(&tasklist_lock) sites for a possible
> conversion to rcu-read_lock() I stumbled over an unprotected user of
> __task_cred in kernel/sys.c
>
> That caused me to audit all the __task_cred usage sites except in
> kernel/exit.c.
>
> Most of the usage sites are correct, but some of them trip over
> invalid assumptions about the protection which is given by RCU.
>
> - spinlocked/preempt_disabled regions are equivalent to rcu_read_lock():
>
> That's wrong. RCU does not guarantee that.
>
> It has been that way due to implementation details and it still is
> valid for CONFIG_TREE_PREEMPT_RCU=n, but there is no guarantee that
> this will be the case forever.

To back this up, item #2 from Documentation/RCU/checklist.txt says:

2. Do the RCU read-side critical sections make proper use of
rcu_read_lock() and friends? These primitives are needed
to prevent grace periods from ending prematurely, which
could result in data being unceremoniously freed out from
under your read-side code, which can greatly increase the
actuarial risk of your kernel.

As a rough rule of thumb, any dereference of an RCU-protected
pointer must be covered by rcu_read_lock() or rcu_read_lock_bh()
or by the appropriate update-side lock.

> - interrupt disabled regions are equivalent to rcu_read_lock():
>
> Wrong again. RCU does not guarantee that.
>
> It's true for current mainline, but again this is an implementation
> detail and there is no guarantee by the RCU semantics.
>
> Indeed we want to get rid of that to avoid scalability issues on
> large systems and preempt-rt got already rid of it to a certain
> extent.

Same item #2 above covers this.

The only exception is when you use synchronize_sched(), as described
in the "Defer"/"Protect" list near line 323 of the 2.6.32 version of
Documentation/RCU/whatisRCU.txt:

Defer Protect

a. synchronize_rcu() rcu_read_lock() / rcu_read_unlock()
call_rcu()

b. call_rcu_bh() rcu_read_lock_bh() / rcu_read_unlock_bh()

c. synchronize_sched() preempt_disable() / preempt_enable()
local_irq_save() / local_irq_restore()
hardirq enter / hardirq exit
NMI enter / NMI exit

And yes, I need to update this based on the addition of
rcu_read_lock_sched() and friends. I will be doing another
documentation update soon.

> I'm sure we are lucky that CONFIG_TREE_PREEMPT_RCU=y is not yet wide
> spread and the code pathes are esoteric enough not to trigger that
> subtle races (some of them might just error out silently).
>
> Nevertheless we need to fix all invalid assumptions about RCU
> protection.

Agreed!!!

> The following patch series fixes all yet known affected __task_cred()
> sites, but there is more auditing of all other rcu users necessary.

Thank you very much for putting this series together -- I will take
a quick look at them.

Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/