Re: [RFC][PATCH] identify in_dev_get rcu read-side critical sections
From: Paul E. McKenney
Date: Wed Sep 28 2005 - 09:51:43 EST
On Wed, Sep 28, 2005 at 12:55:45PM +1000, Herbert Xu wrote:
> David S. Miller <davem@xxxxxxxxxxxxx> wrote:
> >
> > I agree with the changes to add rcu_dereference() use.
> > Those were definitely lacking and needed.
>
> Actually I'm not so sure that they are all needed. I only looked
> at the very first one in the patch which is in in_dev_get(). That
> one certainly isn't necessary because the old value of ip_ptr
> is valid as long as the reference count does not hit zero.
>
> The later is guaranteed by the increment in in_dev_get().
>
> Because the pervasiveness of reference counting in the network stack,
> I believe that we should scrutinise the other bits in the patch too
> to make sure that they are all needed.
>
> In general, using rcu_dereference/rcu_assign_pointer does not
> guarantee correct code. We really need to look at each case
> individually.
Yep, these two APIs are only part of the solution.
The reference-count approach is only guaranteed to work if the kernel
thread that did the reference-count increment is later referencing that
same data element. Otherwise, one has the following possible situation
on DEC Alpha:
o CPU 0 initializes and inserts a new element into the data
structure, using rcu_assign_pointer() to provide any needed
memory barriers. (Or, if RCU is not being used, under the
appropriate update-side lock.)
o CPU 1 acquires a reference to this new element, presumably
using either a lock or rcu_read_lock() and rcu_dereference()
in order to do so safely. CPU 1 then increments the reference
count.
o CPU 2 picks up a pointer to this new element, but in a way
that relies on the reference count having been incremented,
without using locking, rcu_read_lock(), rcu_dereference(),
and so on.
This CPU can then see the pre-initialized contents of the
newly inserted data structure (again, but only on DEC Alpha).
Again, if the same kernel thread that incremented the reference count
is later accessing it, no problem, even on Alpha.
Thanx, Paul
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/