Re: Module loading problem since 5.3

From: Matthias Maennich
Date: Thu Oct 24 2019 - 05:22:20 EST


On Wed, Oct 23, 2019 at 12:35:51PM +0000, Luis Chamberlain wrote:
On Wed, Oct 23, 2019 at 11:49:40AM +0100, Matthias Maennich wrote:
On Fri, Oct 18, 2019 at 12:18:48PM +0000, Luis Chamberlain wrote:
> On Wed, Oct 16, 2019 at 02:37:10PM +0100, Matthias Maennich wrote:
> > On Wed, Oct 16, 2019 at 12:50:30PM +0000, Luis Chamberlain wrote:
> > > On Mon, Oct 14, 2019 at 03:44:40PM +0100, Matthias Maennich wrote:
> > > > Hi Luis!
> > > >
> > > > On Mon, Oct 14, 2019 at 08:52:35AM +0000, Luis Chamberlain wrote:
> > > > > On Fri, Oct 11, 2019 at 09:26:05PM +0200, Heiner Kallweit wrote:
> > > > > > On 10.10.2019 19:15, Luis Chamberlain wrote:
> > > > > > >
> > > > > > >
> > > > > > > > On Thu, Oct 10, 2019, 6:50 PM Heiner Kallweit <hkallweit1@xxxxxxxxx> wrote:
> > > > > > >
> > > > > > > Â ÂMODULE_SOFTDEP("pre: realtek")
> > > > > > >
> > > > > > > Are you aware of any current issues with module loading
> > > > > > > that could cause this problem?
> > > > > > >
> > > > > > >
> > > > > > > Nope. But then again I was not aware of
> > > > > > > MODULE_SOFTDEP(). I'd encourage an extension to
> > > > > > > lib/kmod.c or something similar which stress tests this.
> > > > > > > > One way that comes to mind to test this is to allow a
> > > > > > > > new test case which loads two drivers which depend on
> > > > > > > > each other using this macro. That'll surely blow things
> > > > > > > > up fast. That is, the current kmod tests use
> > > > > > > > request_module() or get_fs_type(), you'd want a new test
> > > > > > > > case with this added, using two new dummy test
> > > > > > > > drivers with the macro dependency.
> > > > > > >
> > > > > > > If you want to resolve this using a more tested path,
> > > > > > > you could have request_module() be used as that is
> > > > > > > currently tested. Perhaps a test patch for that can rule
> > > > > > > out if it's the macro magic which is the issue.
> > > > > > >
> > > > > > > Â Luis
> > > > > >
> > > > > > Maybe issue is related to a bug in introduction of symbol namespaces, see here:
> > > > > > https://lkml.org/lkml/2019/10/11/659
> > > > >
> > > > > Can you have your user with issues either revert 8651ec01daed or apply the fixes
> > > > > mentioned by Matthias to see if that was the issue?
> > > > >
> > > > > > Matthias, which module made you run into the issue
> > > > > > with depmod? I ask as I think it would be wise for us to add a test case
> > > > > using lib/test_kmod.c and tools/testing/selftests/kmod/kmod.sh for the
> > > > > regression you detected.
> > > >
> > > > The depmod warning can be reproduced when using a symbol that is built
> > > > into vmlinux and used from a module. E.g. with CONFIG_USB_STORAGE=y and
> > > > CONFIG_USB_UAS=m, the symbol `usb_stor_adjust_quirks` is built in with
> > > > namespace USB_STORAGE and depmod stumbles upon this emitting the
> > > > following warning (e.g. during make modules_install).
> > > >
> > > > depmod: WARNING: [...]/uas.ko needs unknown symbol usb_stor_adjust_quirks

But this was an issue only when the symbol namespace stuff was used?
Or do we know if it regressed other generic areas of the kernel?

The only known regression was caused by the changed ksymtab entry name
as pointed out above. (Userland) tools depending on that representation
might report issues. That is what [1] addresses by not requiring that
name change any longer and reverting to the previous scheme.


> > > > As there is another (less intrusive) way of implementing the namespace
> > > > feature, I posted a patch series [1] on last Thursday that should
> > > > mitigate the issue as the ksymtab entries depmod eventually relies on
> > > > are no longer carrying the namespace in their names.
> > > >
> > > > Cheers,
> > > > Matthias
> > > >
> > > > [1] https://lore.kernel.org/lkml/20191010151443.7399-1-maennich@xxxxxxxxxx/
> > >
> > > Yes but kmalloc() is built-in, and used by *all* drivers compiled as
> > > modules, so why was that not an issue?
>
> > In ksymtab, namespaced symbols had the format
> >
> > __ksymtab_<NAMESPACE>.<symbol>
> >
> > while symbols without namespace would still use the old format
> >
> > __ksymtab_<symbol>
>
> Ah, I didn't see the symbol namespace patches, good stuff!
>
> > These are also the names that are extracted into System.map (using
> > scripts/mksysmap). Depmod is reading the System.map and for symbols used
> > by modules that are in a namespace, it would not find a match as it does
> > not understand the namespace notation. Depmod would still not emit a
> > warning for symbols without namespace as their format did not change.
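
The mismatch described above can be modelled in a few lines of user-space
Python. This is only a sketch of the name matching, not depmod's actual code:
the helper names, the zeroed addresses and the choice of `__kmalloc` as the
non-namespaced example are assumptions made for illustration. Only the
`__ksymtab_` prefix, the `<NAMESPACE>.<symbol>` notation and the
`usb_stor_adjust_quirks` example come from the thread.

```python
KSYMTAB_PREFIX = "__ksymtab_"


def exported_symbols(system_map_lines):
    """Collect names from the __ksymtab_ entries of a System.map-like listing."""
    names = set()
    for line in system_map_lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip anything that is not "<addr> <type> <name>"
        entry = fields[2]
        if entry.startswith(KSYMTAB_PREFIX):
            names.add(entry[len(KSYMTAB_PREFIX):])
    return names


def unknown_symbols(needed, system_map_lines):
    """Return the needed symbols that have no matching ksymtab entry."""
    known = exported_symbols(system_map_lines)
    return sorted(sym for sym in needed if sym not in known)


# Hypothetical listings; the addresses are placeholders.
NAMESPACED_MAP = [
    "0000000000000000 r __ksymtab___kmalloc",
    "0000000000000000 r __ksymtab_USB_STORAGE.usb_stor_adjust_quirks",
]
PLAIN_MAP = [
    "0000000000000000 r __ksymtab___kmalloc",
    "0000000000000000 r __ksymtab_usb_stor_adjust_quirks",
]

NEEDED_BY_MODULE = ["__kmalloc", "usb_stor_adjust_quirks"]

if __name__ == "__main__":
    print(unknown_symbols(NEEDED_BY_MODULE, NAMESPACED_MAP))
    print(unknown_symbols(NEEDED_BY_MODULE, PLAIN_MAP))
```

With the namespaced naming, only the namespaced symbol goes unmatched, while
the symbol without a namespace is still found. That is why built-in symbols
used by every module did not trigger the warning.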

Now that I reviewed the symbol namespace implementation, and its
respective new fixes, it would seem to me that the issue is an
afterthought: old userspace tools are not able to grok the new
expected format for symbol namespaces, and so with old kmod you'd run
into the depmod warning any time symbol namespaces are used.

Is that correct?

If so, I can't see how this issue could affect the reported issue in
this thread, where folks seem to be detecting a regression where a
module dependency is not being loaded. That is, I don't see how the
symbol namespace stuff could regress existing older symbols, especially
if the EXPORT_SYMBOL_NS() stuff is not used yet.

If this is correct the issue reported with r8169 may be different,
unless the implementation had some side consequences or issues which
we may not yet be aware of.


I don't disagree that the issue that started the thread could be caused
by a different problem. I was merely responding to the question how to
reproduce the outstanding issues in the symbol namespaces that caused
depmod to emit a warning.

Having the user with what may be a regression with r8169 and module
dependency loading try to revert 8651ec01daed would be good to see if
the issue goes away.

> Can we have a test case for this to ensure we don't regress on this
> again? Or put another way, what test cases were implemented for symbol
> namespaces?

While modpost and kernel/module.c are the tests at build and runtime
resp. to enforce proper use of symbol namespaces,

Well clearly it can also be buggy :)

Again, not disagreeing.


I could imagine testing for the proper layout in the ksymtab entries

Do we not have this already done at compile time?

Modpost (now) depends on the proper layout to validate namespaces at
modpost time. But that does not guard against e.g. growth of that entry.
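
A guard against growth could be as simple as a size assertion on the entry
type. The sketch below shows the idea in user space with ctypes. The
three-field, 32-bit relative-offset layout and the field names are assumptions
for illustration only, since the layout is still being finalized by the fixes
in flight. A real check would live next to the kernel's own definition.

```python
import ctypes


class KsymtabEntry(ctypes.Structure):
    # Assumed layout: three 32-bit relative offsets (illustrative only).
    _fields_ = [
        ("value_offset", ctypes.c_int32),
        ("name_offset", ctypes.c_int32),
        ("namespace_offset", ctypes.c_int32),
    ]


class GrownKsymtabEntry(ctypes.Structure):
    # Hypothetical entry that accidentally gained a fourth field.
    _fields_ = KsymtabEntry._fields_ + [("extra", ctypes.c_int32)]


EXPECTED_ENTRY_SIZE = 3 * ctypes.sizeof(ctypes.c_int32)


def entry_size_is_expected(entry_type, expected=EXPECTED_ENTRY_SIZE):
    """Return True if the entry type still has the size the tooling assumes."""
    return ctypes.sizeof(entry_type) == expected


if __name__ == "__main__":
    print(entry_size_is_expected(KsymtabEntry))
    print(entry_size_is_expected(GrownKsymtabEntry))
```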


(note, as mentioned earlier, there are some fixes in flight to finalize
the layout).

Reviewed now, thanks for the lore URL reference!

In addition, I could imagine adding a test that tries to load a module
that uses symbols from a namespace without importing it. The kernel
should deny loading or complain about it (depending on the
configuration). These are also some of the test cases I had when working
on that feature. I did not implement these as automated tests though. I
will put that on my list but help with that would be very welcome.
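
The behaviour such a test would check can be written down as a small model.
This is a sketch of the expected outcome only, not of kernel/module.c. The
function name, the message wording and the `allow_missing` flag are
hypothetical. The flag stands in for whatever configuration selects between
denying the load and merely complaining.

```python
def check_namespace_imports(used, imported, allow_missing=False):
    """Model the load-time decision for a module.

    used:     dict mapping each used symbol to its namespace (None if it has none)
    imported: set of namespaces the module declares as imported
    Returns (loadable, messages).
    """
    messages = []
    for symbol, namespace in sorted(used.items()):
        if namespace is None or namespace in imported:
            continue
        messages.append(
            f"{symbol}: uses namespace {namespace} but does not import it"
        )
    # Without the permissive setting, any complaint means the load is denied.
    loadable = allow_missing or not messages
    return loadable, messages


if __name__ == "__main__":
    used = {"usb_stor_adjust_quirks": "USB_STORAGE", "__kmalloc": None}
    print(check_namespace_imports(used, imported=set()))
    print(check_namespace_imports(used, imported=set(), allow_missing=True))
    print(check_namespace_imports(used, imported={"USB_STORAGE"}))
```

An automated test would then need three cases per demo module: import
present, import missing with strict configuration, and import missing with
permissive configuration.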

Happy to help with that, sure. Now that I grok the namespace kmod issue,
indeed tools/testing/selftests/kmod/kmod.sh and lib/test_kmod.c could be
extended with a new test case for namespaces. Two demo test drivers
would be written which allow for testing the different cases. Let me
know if the suggestion is unclear or if you have any questions about the
code.

I would like to defer this work until the fixes are in. That will
hopefully be -rc5. One additional test case could be to check that the
symbol namespaces required by the module's symbol use are consistent
with the declared imports via modinfo.
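
A rough sketch of that consistency check follows. It works on text in the
style of modinfo output and assumes the declared imports show up as
`import_ns` fields. How the set of required namespaces is obtained is left
open here and passed in by the caller. The function names and the sample
text are made up for illustration.

```python
def declared_imports(modinfo_text):
    """Extract the namespaces listed in import_ns fields of modinfo-style text."""
    imports = set()
    for line in modinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "import_ns":
            imports.add(value.strip())
    return imports


def import_mismatch(required, modinfo_text):
    """Compare required namespaces with the declared ones.

    Returns a dict with the namespaces that are required but not declared
    ("missing") and declared but not required ("unused").
    """
    declared = declared_imports(modinfo_text)
    return {
        "missing": sorted(set(required) - declared),
        "unused": sorted(declared - set(required)),
    }


# Hypothetical modinfo-style output for a module importing one namespace.
SAMPLE_MODINFO = (
    "filename:       [...]/uas.ko\n"
    "license:        GPL\n"
    "import_ns:      USB_STORAGE\n"
)

if __name__ == "__main__":
    print(import_mismatch({"USB_STORAGE"}, SAMPLE_MODINFO))
    print(import_mismatch({"USB_STORAGE", "OTHER_NS"}, SAMPLE_MODINFO))
```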

Thanks for your input!

Cheers,
Matthias


Luis