Re: [v3 00/39] faster tree-based sysctl implementation

From: Lucian Adrian Grijincu
Date: Mon May 23 2011 - 09:27:15 EST


On Mon, May 23, 2011 at 12:32 PM, Eric W. Biederman
<ebiederm@xxxxxxxxxxxx> wrote:
> Mostly it is simplicity. There is also the fact that the spin lock is
> used in the implementation of something that is essentially a
> reader/writer lock already.


The amount of time in which the spin lock is held in the current
implementation can be quite large: in __register_sysctl_paths:

https://github.com/mirrors/linux-2.6/blob/v2.6.39/kernel/sysctl.c#L1887

	spin_lock(&sysctl_lock);
	for (set = header->set; set; set = set->parent)
		list_for_each_entry(p, &set->list, ctl_entry)
			try_attach(p, header);
	spin_unlock(&sysctl_lock);


For N=10^5 headers and try_attach=O(N) it's not a very good locking mechanism.

That's why I opted for a rwlock for each dir's subdirs/tables.


> In that context I'm not certain I am comfortable with separating proc
> inode usage from other proc usage. But I haven't read through that
> section of your code well enough yet to tell if you are making sense.


Proc inode usage (->count) was already separate from other proc usage (->use).
It was not separate from other header references (shared in ->count).

I separated the two because when I call unregister on a header I need
to decide whether to really unregister it (->unregistering=true and no
one can see this header and anything under it any more) or just
decrement a reference.

In the current implementation a header is only created by a
__register_sysctl_paths call and it's clear that at unregister we have
to set ->unregistering.
In my implementation headers are created dynamically to create new
directory elements. I need to know when to unregister such a header
regardless of any possible procfs inode references.

https://github.com/luciang/linux-2.6-new-sysctl/blob/v4-new-sysctl-alg/kernel/sysctl.c#L2390
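
The split between the two kinds of references can be sketched roughly as
follows. This is a simplified, single-threaded illustration and not the code
at the link above: header_refs, header_unregister and header_can_free are
hypothetical names, only ctl_procfs_refs and unregistering come from the
discussion, and all locking is left out.

```c
#include <assert.h>
#include <stdbool.h>

struct header {
	int header_refs;     /* hypothetical: registrations using this header */
	int ctl_procfs_refs; /* procfs inodes still pointing at this header */
	bool unregistering;  /* set once the header must no longer be visible */
};

/*
 * Drop one registration reference. The header is marked as unregistering
 * when the last registration goes away, regardless of how many procfs
 * inodes still reference it. Returns true if the header became hidden.
 */
bool header_unregister(struct header *h)
{
	assert(h->header_refs > 0);
	if (--h->header_refs > 0)
		return false;
	h->unregistering = true;
	return true;
}

/* Memory can only go away once procfs has dropped its references too. */
bool header_can_free(const struct header *h)
{
	return h->unregistering && h->ctl_procfs_refs == 0;
}
```

With a single shared counter, an outstanding procfs inode would be
indistinguishable from a live registration, so the unregister path could not
tell when to set unregistering.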



I pushed a new version:
git://github.com/luciang/linux-2.6-new-sysctl.git v4-new-sysctl-alg

I undid int->u8 for ctl_procfs_refs.

I left the ->permissions hook getting its namespace from current->
because rewriting history for that change trips on too many patches
and a new parameter can be very easily added later when needed. Hope
this is ok with you.


I'd like to send patches for review to archs/drivers/etc. that
register only tables of files, not whole sysctl trees.
The patches don't depend on anything from this series.

Examples:
* http://thread.gmane.org/gmane.linux.kernel/1137032/focus=1137089
* http://thread.gmane.org/gmane.linux.kernel/1137032/focus=1137087


I'd like an OK-GO from you.

--
 .
..: Lucian