Re: [v3 00/39] faster tree-based sysctl implementation

From: Eric W. Biederman
Date: Mon May 23 2011 - 05:32:23 EST

Lucian Adrian Grijincu <lucian.grijincu@xxxxxxxxx> writes:

> On Mon, May 23, 2011 at 7:27 AM, Eric W. Biederman
> <ebiederm@xxxxxxxxxxxx> wrote:
>> This patchset looks like it is deserving of some close scrutiny, and
>> not just the high level design overview I have given the previous
>> patches. This is going to be a busy week for me so I probably won't
>> get through all of the patches for a while.
>
> I have one more question. The current implementation uses a single
> sysctl_lock to synchronize all changes to the data structures.
> In my algorithm I change a few places to use a per-header read-write
> lock. Even though the code is organized to handle a per-header rwlock,
> the implementation uses a single global rwlock. In v2 I got rid of the
> rwlock and replaced the subdirs/files regular lists with rcu-protected
> lists and that's why I did not bother giving each header a rwlock.
> I have no idea how to use rcu with rbtree. Should I now give each
> header its own lock to reduce contention?

I would only walk down that path if we can find some profile data
showing that the lock is where we are hot.

> I'm asking this because I don't know why there is only a global sysctl
> spin lock, when multiple locks could have been used, each to protect
> its own domain of values.

Mostly it is simplicity. There is also the fact that the spin lock is
used in the implementation of something that is essentially a
reader/writer lock already.

With the help of the reference counts we block when we are unregistering
until there are no more users.
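
A rough userspace sketch of that pattern may make it easier to picture.
This is only an analogy, not the kernel code: it assumes a pthread mutex
in place of the sysctl spin lock and a condition variable in place of
whatever the kernel uses to sleep, and the struct, field, and function
names are chosen for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a sysctl table header. */
struct table_header {
    int used;            /* number of active users of this header */
    bool unregistering;  /* set once removal has begun */
};

/* A single global lock guards every header (assumed stand-in for sysctl_lock). */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t table_idle = PTHREAD_COND_INITIALIZER;

/* Reader side: take a reference unless removal has already started. */
static bool use_table(struct table_header *h)
{
    bool ok;

    pthread_mutex_lock(&table_lock);
    ok = !h->unregistering;
    if (ok)
        h->used++;
    pthread_mutex_unlock(&table_lock);
    return ok;
}

/* Reader side: drop the reference; the last user wakes a waiting remover. */
static void unuse_table(struct table_header *h)
{
    pthread_mutex_lock(&table_lock);
    assert(h->used > 0);
    if (--h->used == 0 && h->unregistering)
        pthread_cond_broadcast(&table_idle);
    pthread_mutex_unlock(&table_lock);
}

/* Writer side: refuse new users, then block until the last one leaves. */
static void start_unregistering(struct table_header *h)
{
    pthread_mutex_lock(&table_lock);
    h->unregistering = true;
    while (h->used > 0)
        pthread_cond_wait(&table_idle, &table_lock);
    pthread_mutex_unlock(&table_lock);
}
```

The lock is only held for the few instructions that touch the counters,
so many users can be inside a table at once, while the remover is
exclusive: that is what makes the combination behave like a
reader/writer lock.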

In that context I'm not certain I am comfortable with separating proc
inode usage from other proc usage. But I haven't read through that
section of your code well enough yet to tell if you are making sense.

One of the things that would be very nice to do is add lockdep
annotations like I have to sysfs_activate and sysfs_deactivate, so we
can catch the all too common case of someone unregistering a sysctl
table when there are problems.

Personally I'm not happy with the state of the locking abstractions in
sysctl today. It is all much too obscure, and there are too few
warnings. However for your set of changes I think the thing to focus
on is getting sysctl to better data structures so that it can scale.

Once the data structures are simple enough any remaining issues should
be fixable with small straightforward patches.
