Re: [RFC v2 0/6] Track RCU dereferences in RCU read-side critical sections
From: Peter Zijlstra
Date: Fri Feb 26 2016 - 06:25:36 EST
On Fri, Feb 26, 2016 at 11:06:27AM +0800, Boqun Feng wrote:
> On Thu, Feb 25, 2016 at 07:37:24AM -0800, Paul E. McKenney wrote:
> > On Thu, Feb 25, 2016 at 03:32:43PM +0100, Peter Zijlstra wrote:
> > > On Tue, Feb 16, 2016 at 01:57:39PM +0800, Boqun Feng wrote:
> > > > As a characteristic of RCU, read-side critical sections have a very
> > > > loose connection with rcu_dereference()s, which is you can only be sure
> > > > about an rcu_dereference() might be called in some read-side critical
> > > > section, but if code gets complex, you may not be sure which read-side
> > > > critical section exactly, this might be also an problem for some other
> > > > locking mechanisms, that is the critical sections protecting data and
> > > > the data accesses protected are not clearly correlated.
> > > >
> > > > In this series, we are introducing LOCKED_ACCESS framework and based on
> > > > which, we implement the RCU_LOCKED_ACCESS functionality to give us a
> > > > clear hint: which rcu_dereference() happens in which RCU read-side
> > > > critical section.
> > > >
> > > > After this series applied, and if CONFIG_RCU_LOCKED_ACCESS=y, the proc
> > > > file /proc/locked_access/rcu will show all relationships collected so
> > > > far for rcu_read_lock() and their friends and rcu_dereference*().
> > >
> > > But why !? What does this bring us, why do I want to even look at these
> > > patches?
> >
> > There were some complaints about the difficulty of figuring out what
> > was being protected by a given rcu_read_lock() in cases where the
> > corresponding rcu_dereference() is several function calls down, and
> > especially in cases where the function calls are via pointers.
> > These cases show up in a number of places, perhaps most prominently
> > in networking.
> > Boqun's patches therefore use lockdep to make an association between
> > each rcu_dereference() and the rcu_read_lock() protecting it.
> >
>
> Yes, this patch is aiming to provide some information about this,
> actually the background of this patchset is a discussion between Ingo
> and Paul last year:
>
> http://lkml.kernel.org/g/20150414102505.GA13015@xxxxxxxxx
>
> In that discussion, Ingo gave one example about how hard it is to figure
> out what a RCU read-side critical section is protecting. Ingo's proposal
> of solving this problem was to add an extra parameter for RCU related
> primitives, however, Paul seemed to have concerns about that approach,
> because there were many RCU users' code that would need to be modified
> and there were corner cases where the one extra parameter was not enough
> or not necessary.
>
> Therefore I tried to figure out a way for making the association of
> rcu_dereference() and rcu_read_lock() automatically without the
> modification of code of the RCU users. That's how this patchset
> comes.
So the problem I have with this is that your text didn't have a problem
statement, and if there's no problem, what the heck is all this code
for.
Secondly, I'm not sure a random list of data is going to help anybody.
The whole point of lockdep is that it tells you if you do broken. This
doesn't, and cannot, because it lacks information to verify against.
So I'm still not sure this is useful. Also, I would argue your code has
problems if you cannot even find your rcu_read_lock().
And yes, networking is a special snowflake, they very often push their
rcu_read_lock(),local_bh_disable(),preempt_disable() whatnot, all the
way out, because cycles.
Which makes it entirely impossible to tell what all it guards. That's a
very concious choice they make, but its a royal pain for a number of
things. I'm not sure we can do anything about it.
So no, I'm not convinced. Again, why is this useful?