Re: [PATCH v8 2/3] modpost: Produce extended MODVERSIONS information
From: Matthew Maurer
Date: Mon Nov 18 2024 - 19:10:01 EST
> Thinking about this some more, if we're going down enabling a new
> option, it seems to beg the question if the old *two* ksymtab sections
> could just be folded into the a new one where the "gpl only" thing
> becomes just one "column" as you call it. Reasons I ask, it seems like
> we're duplicating symbol names on ksymtab and for modeversions. Could
> you review this a bit?
Short answer: We could do this, but I don't necessarily think it's a good idea.
ksymtab and modversions aren't duplicating names even with this patch
series - We have two different formats, one for importing symbols, and
one for exporting them. `__ksymtab`, `__ksymtab_gpl`, and
`__ksymtab_strings` are used to export symbols. `__versions` or the
new `__version_ext_names` and `__version_ext_crcs` are used to import
them. For this reason, in any given compilation unit, a string should
only appear either in the ksymtab (providing it), or in versions
(consuming it).
There also isn't as much immediate technical need for that kind of
rework of the ksymtab format - ksymtab uses a string table for their
names, so the "long name support" that extended modversions provides
to modversions is already present in ksymtab.
Combined, this means that there would be few technical benefits to
this - the primary potential benefit I could see to something like
this would be code complexity reduction, which is a bit of a matter of
personal taste, and mine might not match others'.
However, we could do some things similar to what's going on here:
A. We could try to unify versions and ksymtab (this seems most viable,
but the change in meaning of this data structure has me wary)
B. We could make ksymtab use columnar storage for more things - it
already does so for CRCs, we could theoretically make any or all of
licensing, namespaces, or symbol values columnar.
With the caveat that I am not convinced this restructuring is worth
the churn, the way I would do A would be:
1. Add a field to the `kernel_symbol` that indicates whether the
symbol is import/export (or possibly re-use `value` with a 0 value
after linker resolution to mean "import" instead of export).
2. Generate `kernel_symbol` entries for imported symbols, not just
exported ones.
3. Read `kcrctab` for import symbols to figure out what the expected
crc value is when importing, rather than using versions.
4. Stop generating/reading any of `__versions`, `__version_ext_names`,
`__versions_ext_crcs`, etc.
There are two downsides I can see to this:
1. You cannot make this backwards compatible with existing `kmod`.
(This was the argument given against just enlarging MODVERSIONS symbol
names.)
2. It's hard to be certain that we know about all users of `ksymtab`
in order to ensure they all know the new convention around imported vs
exported symbols.
I think that B would actually make things worse because symbols always
today always have a value, a namespace, a name, and a license. The
only thing that's optional is the CRC, and that's already columnar.
Making the other ones columnar would hurt locality. We'd still need
the strtab sections, or we'd end up with many copies of each
namespace, where today that should get deduped down by the linker.
Columns are good for things that are extensions, optional, or variable
length.
If there are other reasons *for* doing this that I'm not aware of,
what I'd do would be:
1. Use the name as the primary index, same as modversions.
2. Split each other piece into its own column, with a joint iterator.
3. Convert license into a column, with an enum value (currently only
fully exported or GPL).
4. Replace places in the coe where a `struct kernel_symbol *` is used
today with an iterator over the joint columns.
Again, to reiterate, I *do not* think that B is a good idea. A might
be, but the improvement seems sufficiently marginal to me that I don't
know if it's worth the churn.