Re: [RFC] Provide in-kernel headers for making it easy to extend the kernel

From: Daniel Colascione
Date: Wed Mar 06 2019 - 19:33:24 EST


On Wed, Mar 6, 2019 at 4:07 PM H. Peter Anvin <hpa@xxxxxxxxx> wrote:
>
> On 3/6/19 3:37 PM, Daniel Colascione wrote:
> >
> > I just don't get the opposition to Joel's work. The rest of the thread
> > already goes into detail about the problems with pure-filesystem
> > solutions, and you and others are just totally ignoring those
> > well-thought-out rationales for the module approach and doing
> > inflooping on "lol just use a tarball". That's not productive.
> >
> > Look; here's the bottom line: without this work, doing certain kinds
> > of system tracing is a nightmare, and with this patch, it Just Works.
> > You're arguing that various tools should do a better job of keeping
> > the filesystem in sync with the kernel. Maybe you're right. But we
> > don't live in a world where they will, because if this coherence were
> > going to happen, it'd work already. But this work solves the problem:
> > by necessity, anything that changes a kernel image *must* update
> > modules coherently, whether the kernel image and module come from the
> > filesystem, network boot, some kind of SQL database, or carrier
> > pigeon. There's nothing wrong with work that very cheaply makes the
> > kernel self-describing (introspection is elegant) and that takes
> > advantage of *existing* kernel tooling infrastructure to transparently
> > do a new thing.
> >
> > You don't have to use this patch if you don't want to. Please stop
> > trying to block it.
> >
>
> No, that's not how this works. It is far worse to do something the wrong
> way than not doing it at all, when it affects the kernel-user space
> interactions.

And what are the supposedly disastrous consequences of this change?
It's basically a souped-up version of /proc/config.gz. Tell me more about
the trail of destruction and regret behind /proc/config.gz.
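
For anyone who hasn't looked at how little there is to it, here is a
rough sketch of how a user-space tool might consume /proc/config.gz
(present when the kernel is built with CONFIG_IKCONFIG_PROC). The
function names are made up for illustration and are not taken from any
existing tool.

```python
import gzip


def parse_kernel_config(raw: bytes) -> dict:
    """Parse gzip-compressed kernel config text into {option: value}.

    Comment lines (including '# CONFIG_FOO is not set') are skipped.
    """
    options = {}
    text = gzip.decompress(raw).decode("utf-8", "replace")
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            options[key] = value
    return options


def running_kernel_config(path="/proc/config.gz") -> dict:
    """Read the config of the booted kernel; no filesystem package needed."""
    with open(path, "rb") as f:
        return parse_kernel_config(f.read())
```

The point is that the data comes from the kernel that actually booted,
so it cannot go out of sync with it.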

> Experience -- and we have almost 30 years of it -- has shown that hacks
> of this nature become engrained and all of a sudden is "mandatory". At
> the *very least* it needs to comply with the zero-one-infinity rule
> rather than being yet another ad hoc hack.

It already satisfies the zero-one-infinity rule by virtue of not being
a system for encoding an arbitrary number of random kernel header
blobs for some reason in a single kernel.

> More fundamentally, we already have multiple ways to handle objects that
> need to go into the filesystem: they can be installed with (or as)
> modules, they can use the firmware interface, and so on.

*There* *may* *be* *no* *filesystem*. Or the filesystem may be
read-only. The only thing the kernel can really guarantee is its own
existence --- it should be entire in itself. If I'm hacking on an
Android kernel and say "fastboot boot mykernel" without making any
changes to the device's boot filesystem, I should still be able to use
tracing tools that rely on knowing the headers for the kernel with
which the device happened to boot. Any approach that requires
coordinated kernel and filesystem changes to make this use case work is
inferior to what Joel's proposed.

> Saying "it can be a module" is worse than a copout: even if dynamically
> loaded -- and many setups lock out post-boot module loadings for
> security reasons -- there is nothing to cause it to unload.

Those setups can ship kernel headers as they do today. Or a
tmpfs-based approach may be workable.

> The bottom line is that in the end there is no difference between making
> this an archive of some sort and a module, except that to use the module
> you need privilege that you otherwise may not need. If your argument is
> that someone may not be providing the whole set of items provided by
> "make modules_install", what is there to say that they would include
> your specific module?
>
> Here are some better ways of implementation I can see:
>
> 1. Include an archive in "make modules_install". Most trivial; no kernel
> changes at all.

No. See above.

> 2. Generalize the initramfs code to be able to create a pre-populated
> tmpfs at any time, that can be populated from an archive provided by
> the firmware loading mechanism; like all firmware allows it to either
> be built in or fetched from the filesystem. This allows it to be
> built in to the kernel image if that becomes necessary; using tmpfs
> means that it can be pushed out to swap rather than permanently
> stored in kernel memory, and this filesystem can be unmounted freeing
> its memory.

Backing the blob storage with tmpfs is a reasonable tweak to Joel's
existing model. We can mark the header blob discardable and memcpy it
into some tmpfs-backed storage. This way, it can swap, and you can
release the memory with rm(1) as well as unmount. You might as well
expose the facility as a new just-like-tmpfs filesystem that init
scripts can mount anywhere --- once. Making the thing a firmware blob
sounds fine too, although I know less about that subsystem. But
blocking this work as a whole in favor of some yet-to-be-designed
general-purpose initramfs-tmpfs conversion thingamajig really is
perfect-is-the-enemy-of-the-good-ism, and I don't think tmpfs storage
is necessary for the initial version of this work.
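
To make the user-space side of that concrete, here is a minimal sketch
of what a tracing tool could do, assuming the kernel exposes the headers
as a compressed tar archive at some path. The path is left as a
parameter because the interface isn't settled, and the function names
are hypothetical. Unpacking into a tmpfs-backed directory (for example
under /dev/shm, also an assumption) gives you the swappable, removable
behaviour described above.

```python
import shutil
import tarfile
import tempfile


def unpack_headers(archive_path, dest_root=None):
    """Unpack a kernel-provided header archive into a fresh directory.

    archive_path is wherever the kernel exposes the archive (hypothetical).
    dest_root may point at a tmpfs mount so the files can be swapped out.
    """
    dest = tempfile.mkdtemp(prefix="kheaders-", dir=dest_root)
    # Use the safer extraction filter where this Python version has it.
    extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive_path, "r:*") as tar:
        tar.extractall(dest, **extra)
    return dest


def release_headers(dest):
    """Remove the unpacked tree, the equivalent of rm -r on the directory."""
    shutil.rmtree(dest)
```

Nothing here needs the boot filesystem to have been prepared in advance,
which is the whole point.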

> 3. Use a squashfs image instead of an archive.

Why?