The stack trace from the Oops looks _very_ strange!
I don't believe that vmalloc _ever_ calls sys_init_module!!!
If you look at the call stack from the bottom up, it looks as if
the Oops happens when sys_init_module calls the init_module
function in the module, but the stack trace seems to have been mangled.
If we can rule out any mishaps with the "latest version"(?) of
__generic_memcpy_tofs, a (slightly) possible explanation could be that the
module has become unloaded (by kerneld) and that a reference to
unallocated memory was being made. I still really don't believe that
the sys_init_module has _anything_ to do with the Oops...
But, with kerneld's frequent (re-)loading and unloading of modules we will
see effects of lacking MOD_INC_USE_COUNT in some modules, the hard way...
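To make the use-count point concrete, here is a minimal user-space sketch. It is not kernel code: struct toy_module and the toy_* functions are hypothetical names invented for illustration, and they only mimic the idea behind MOD_INC_USE_COUNT and its counterpart MOD_DEC_USE_COUNT, assuming an automatic unloader that removes any module whose count is zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a loaded module (illustration only). */
struct toy_module {
    const char *name;
    int use_count;  /* the counter a real module adjusts via MOD_INC/DEC_USE_COUNT */
    bool loaded;
};

/* What a driver's open() should do: mark the module as in use. */
static void toy_get(struct toy_module *m)
{
    m->use_count++;
}

/* What a driver's release() should do: drop the reference again. */
static void toy_put(struct toy_module *m)
{
    assert(m->use_count > 0);  /* an unbalanced put is a bug */
    m->use_count--;
}

/* What an automatic unloader is assumed to do: remove idle modules only.
 * Returns true if the module was unloaded. */
static bool toy_try_unload(struct toy_module *m)
{
    if (!m->loaded || m->use_count != 0)
        return false;
    m->loaded = false;
    return true;
}
```

A module that forgets the toy_get() step looks idle while it is still in use, so toy_try_unload() succeeds and any later reference into it touches memory that is gone.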
> : Maybe the sound module was
> : busy being loaded when kerneld started loading it again, or something
> : like this. Björn, is there anything that protects against that kind of
> : "re-entrancy" problem?
>
> No, current module stuff does no locking and even do not try to resolve
> any race conditions. I encountered this problem very long ago.
Well, kerneld doesn't protect from this, but insmod and the kernel do...
If a module is loaded, or is in the process of being loaded, there is a
check in linux/kernel/module.c that looks if there already exists a module
with the same name.
This check is made already in the first step of loading a module, i.e.
sys_create_module(), where the name of the module-to-be is also stored.
So, there _is_ a locking mechanism for handling "re-entrant" module loads.
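The idea can be sketched in a few lines of user-space C. This is an illustration only, not the code in linux/kernel/module.c: toy_table, toy_create_module() and toy_init_module() are hypothetical names, the return values are made up, and the sketch is single-threaded. It only shows why reserving the name in the first step makes a second load attempt fail while the first one is still in progress.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_MODULES 8
#define TOY_NAME_LEN    32

enum toy_state { TOY_FREE = 0, TOY_UNINITIALIZED, TOY_RUNNING };

struct toy_slot {
    char name[TOY_NAME_LEN];
    enum toy_state state;
};

static struct toy_slot toy_table[TOY_MAX_MODULES];

/* Step 1 of a load: reserve the name.
 * Returns 0 on success, -1 if the name is already taken (loaded or
 * still loading), -2 if the table is full or the name is too long. */
static int toy_create_module(const char *name)
{
    struct toy_slot *free_slot = NULL;
    size_t i;

    if (strlen(name) >= TOY_NAME_LEN)
        return -2;
    for (i = 0; i < TOY_MAX_MODULES; i++) {
        if (toy_table[i].state == TOY_FREE) {
            if (free_slot == NULL)
                free_slot = &toy_table[i];
        } else if (strcmp(toy_table[i].name, name) == 0) {
            return -1;  /* duplicate: refuse the second load */
        }
    }
    if (free_slot == NULL)
        return -2;
    strcpy(free_slot->name, name);
    free_slot->state = TOY_UNINITIALIZED;  /* name reserved, not yet running */
    return 0;
}

/* Step 2 of a load: the reserved module becomes usable.
 * Returns 0 on success, -1 if no reserved module has that name. */
static int toy_init_module(const char *name)
{
    size_t i;

    for (i = 0; i < TOY_MAX_MODULES; i++) {
        if (toy_table[i].state == TOY_UNINITIALIZED &&
            strcmp(toy_table[i].name, name) == 0) {
            toy_table[i].state = TOY_RUNNING;
            return 0;
        }
    }
    return -1;
}
```

Because the name is stored before the module body is loaded, a second request for "sound" is rejected both during and after the first load.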
The only "duplicate" check in kerneld is for "request-route" for the
same IP address.
There shouldn't be yet another check in kerneld for a thing that the
kernel itself already checks for.
> F.e. when amd starts, it mounts a lot of NFS file systems,
> and it starts happily only in 50% cases, is NFS is module.
> Moreover, I will eat my hat, if someone will able to make kerneld
> truly reliable without complete rewrite of module.c.
Do you favour any particular seasoning, or do you eat the hat "au naturel"? :-)
>
> It was the main reason, why I rewrote all the module stuff
> from the scratch. Unfortunately, it is orthogonal to standard
> module implementation.
Definitely...
The symbol versioning got lost, along with the support for different
module sets that are now implemented in /lib/modules.
You have also removed the other uses for kerneld in the process,
and put a suboptimal insmod in kernel space. It doesn't belong there...
There _are_ a couple of things left to do with kerneld, especially with
handling return messages to requests generated during interrupts,
so you don't have to hurry to select _what_ hat to use just yet... :-)
Bjorn <bj0rn@blox.se>