Re: [PATCH 2/2] module: add support to avoid duplicates early on load

From: Luis Chamberlain
Date: Thu May 25 2023 - 14:23:00 EST


On Thu, May 25, 2023 at 05:42:10PM +0100, Greg KH wrote:
> Luis, I asked last time what modules are being asked by the kernel to be
> loaded thousands of times at boot and can't seem to find an answer
> anywhere, did I miss that?

Yes, you missed it. I had explained it here:

https://lore.kernel.org/all/ZEGopJ8VAYnE7LQ2@xxxxxxxxxxxxxxxxxxxxxx/

"My best assessment of the situation is that each CPU in udev ends up
triggering a load of duplicate set of modules, not just one, but *a
lot*. Not sure what heuristics udev uses to load a set of modules per
CPU."

Petr Pavlu then finishes the assessment:

https://lore.kernel.org/all/23bd0ce6-ef78-1cd8-1f21-0e706a00424a@xxxxxxxx/

But let me quote it, so it is not missed:

"My understanding is that udev workers are forked. An initial kmod
context is created by the main udevd process but no sharing happens
after the fork. It means that the mentioned memory pool logic doesn't
really kick in.

Multiple parallel load requests come from multiple udev workers, for
instance, each handling an udev event for one CPU device and making the
exactly same requests as all others are doing at the same time.

The optimization idea would be to recognize these duplicate requests at
the udevd/kmod level and converge them."
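
To make the "converge them" idea concrete, here is a minimal sketch of
coalescing duplicate load requests. It is only an illustration, not udev,
kmod, or kernel code: threads in a single process stand in for the forked
udev workers (real workers share no memory, so an actual fix would need
cross-process coordination or handling in the kernel), and LoadCoalescer,
fake_load and simulate are hypothetical names made up for the example.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


class LoadCoalescer:
    """Converge requests for the same module into one real load (sketch)."""

    def __init__(self, do_load):
        self._do_load = do_load   # hypothetical callable doing the real load
        self._lock = threading.Lock()
        self._requests = {}       # module name -> Future shared by requesters

    def request(self, name):
        # The first requester for a name becomes the leader and does the
        # work; everyone else waits on, or reuses, the same Future.
        with self._lock:
            fut = self._requests.get(name)
            leader = fut is None
            if leader:
                fut = Future()
                self._requests[name] = fut
        if leader:
            try:
                fut.set_result(self._do_load(name))
            except Exception as exc:
                # Forget failed loads so a later request can retry.
                with self._lock:
                    del self._requests[name]
                fut.set_exception(exc)
        return fut.result()


def simulate(num_workers=8, modules=("mod_a", "mod_b")):
    """Have every worker request the same modules; record the real loads."""
    calls = []
    calls_lock = threading.Lock()

    def fake_load(name):
        # Stand-in for the expensive part of a module load.
        with calls_lock:
            calls.append(name)
        return 0

    coalescer = LoadCoalescer(fake_load)

    def worker(_):
        return [coalescer.request(m) for m in modules]

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        results = list(pool.map(worker, range(num_workers)))
    return sorted(calls), results
```

With 8 workers each asking for the same 2 modules, fake_load runs once
per module rather than once per worker per module, while every worker
still gets a result back.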

> This should be very easy to handle in
> userspace if systems need it, so that begs the questions, what types of
> systems need this?

As I had explained, this has existed for a long time.

> We have handled booting with tens of thousands of
> devices attached for decades now with no reports of boot/udev/kmod
> issues before, what has recently changed to cause issues?

That doesn't mean this didn't happen before. Just because the memory
used by duplicates gets freed does not mean that the memory pressure
they induce is not stupid. It is stupid, but it hasn't come up as a
possible real issue until now, when systems require more vmalloc space
during boot because of new features. I had also explained the context
this came from: David Hildenbrand had reported a failure to boot on
systems with many CPUs. If you induce more vmap memory pressure on boot
with multiple CPUs, eventually you can't boot. Enabling KASAN will make
this worse today.

Luis