Re: [RFC] change in /proc/devices

From: Alexander Viro (viro@math.psu.edu)
Date: Tue Jan 25 2000 - 08:36:02 EST


On Tue, 25 Jan 2000, Alan Cox wrote:

> > > I think perhaps we need a module_lock that forbids just module load/unload
> > > when its held. A reader/writer lock with the unload as the writer and
> > > the rest as readers ?
> >
> > IMO I have a better variant. I'm doing an equivalent of routing for the
> > device numbers. I.e. maintain a tree of devno blocks. The lowest layer
>
> You are solving only a tiny fraction of the problem. The block/char device
> case is far from the entire problem that solving the module problem separately
> might be a good idea ?

        Alan, do you really want to touch the whole mess now? I _know_
that it's a tiny piece of the problem. But I also know what it will take
to fix. Look: we can't cover all places where we use module-provided data -
too wide an area, too many entry points, too much overhead even for
read_lock() on each entry. We need a subset - those places where we can be
running with the module reference counter at zero. Now things become really
interesting. This area (let's call it A) is still too large. We need to
make it smaller. We _definitely_ don't want it extending into the modules
if that can be avoided.
        Example 1. Currently opening a device (provided that the module is
in-core) goes like this:
        blkdev_open() finds the methods
        it calls ->open() (foo_open())
        after a while, foo_open() either increments the module counter and
returns 0 or returns an error.
See what is broken here? Yup, we are relying on foo_open() taking
care to increment the counter before anything that might block.
Correct solution: make blkdev_open() increment the counter _and_ decrement
it again if ->open() returns non-zero. But that requires information
about the module that gave us the methods. I've done it in the CIDR patch.
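
A minimal userspace sketch of that calling convention, not the code from
the CIDR patch. The `owner` field, module_get()/module_put() and
blkdev_open_sketch() are hypothetical names, and the counter is a plain
int, so it shows only who increments and decrements, not the locking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's module descriptor. */
struct module {
    int use_count;
};

/* Hypothetical method table that remembers which module supplied it. */
struct block_device_operations {
    struct module *owner;   /* NULL would mean "built into the kernel" */
    int (*open)(void *dev);
};

static void module_get(struct module *m)
{
    if (m)
        m->use_count++;
}

static void module_put(struct module *m)
{
    if (m) {
        assert(m->use_count > 0);   /* never drop below zero */
        m->use_count--;
    }
}

/* The caller pins the module before ->open() and unpins on failure,
 * so the driver's open routine no longer has to do it itself. */
static int blkdev_open_sketch(const struct block_device_operations *ops,
                              void *dev)
{
    int err;

    module_get(ops->owner);
    err = ops->open(dev);
    if (err)
        module_put(ops->owner);
    return err;
}

/* Two toy drivers: one whose open succeeds, one whose open fails. */
static int foo_open_ok(void *dev)   { (void)dev; return 0; }
static int foo_open_fail(void *dev) { (void)dev; return -5; }
```

With this shape a successful open leaves the count one higher, a failed
open leaves it unchanged, and neither toy driver touches the counter.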

        Example 2. read_super() / foo_read_super(). Same story, same
solution. But it means that file_system_type gets a new field - a pointer
to the module.
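
The same idea for filesystems might look roughly like this. It is a sketch
under assumptions: the structures are cut down to almost nothing, the field
name `owner` and the function read_super_sketch() are made up for
illustration, and failure is assumed to be signalled by a NULL return.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct module      { int use_count; };
struct super_block { int dummy; };

struct file_system_type {
    const char *name;
    struct super_block *(*read_super)(struct super_block *sb);
    struct module *owner;   /* the new field: who provides read_super */
};

/* Generic code pins the owning module around the filesystem's own
 * read_super method and drops the reference if the mount fails. */
static struct super_block *read_super_sketch(struct file_system_type *type,
                                             struct super_block *sb)
{
    struct super_block *res;

    if (type->owner)
        type->owner->use_count++;
    res = type->read_super(sb);
    if (!res && type->owner) {
        assert(type->owner->use_count > 0);
        type->owner->use_count--;
    }
    return res;
}

/* Toy filesystems: foo mounts successfully, bar always fails. */
static struct super_block *foo_read_super(struct super_block *sb)
{
    return sb;
}

static struct super_block *bar_read_super(struct super_block *sb)
{
    (void)sb;
    return NULL;
}
```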

        And there are a lot of other places. Now, provided that we've
covered them and got the area down to something reasonable, we can do the
thing.
Consider the places where we unregister module-provided data structures
(i.e. several spots inside unregister_foo()). Call that area B. What we
need is
        1) make the module unloading mark the module as being unloaded.
In the very beginning.
        2) add UNSAFE_MOD_INC_USE_COUNT (bletch). It should be atomic wrt
(1) and fail if the module is already marked as being unloaded. Call
it instead of MOD_INC_USE_COUNT if the counter may be zero (~= in A).
        3) guarantee exclusion between B and A.
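
Steps (1) and (2) could be sketched in userspace C11 as below. This is
one possible encoding, not a proposal for the real macro: the flag bit
MOD_UNLOADING, the single `state` word and all three function names are
assumptions. The sketch also assumes that unloading is refused while the
count is non-zero, and it does nothing about (3).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MOD_UNLOADING 0x80000000u   /* hypothetical "being unloaded" bit */

/* One word holds both the flag and the use count, so the mark and the
 * increment can be made atomic with respect to each other. */
struct module {
    atomic_uint state;
};

/* Step (1): mark the module as being unloaded. In this sketch the mark
 * only succeeds when nobody holds a reference (state == 0). */
static bool module_begin_unload(struct module *m)
{
    unsigned expected = 0;

    return atomic_compare_exchange_strong(&m->state, &expected,
                                          MOD_UNLOADING);
}

/* Step (2): increment that fails once the module is marked. For use
 * where the counter may still be zero (area A). */
static bool unsafe_mod_inc_use_count(struct module *m)
{
    unsigned old = atomic_load(&m->state);

    do {
        if (old & MOD_UNLOADING)
            return false;           /* too late, unload has started */
    } while (!atomic_compare_exchange_weak(&m->state, &old, old + 1));
    return true;
}

static void mod_dec_use_count(struct module *m)
{
    unsigned old = atomic_fetch_sub(&m->state, 1);

    assert((old & ~MOD_UNLOADING) > 0);     /* must have held a ref */
}
```

A caller in A would test the return value and back out with an error when
the increment fails, instead of jumping into a module that is on its way
out.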

So there. It can be done, but that will be a lot of work and several
changes of data structures. I don't think that it can go in before 2.5.
What _can_ be done is making A shrink, preferably moving it completely
into the kernel proper. And that will take a while.



