Re: [PATCH] modpost: support arbitrary symbol length in modversion

From: Andrea Righi
Date: Tue Mar 14 2023 - 10:38:33 EST


On Mon, Mar 13, 2023 at 11:09:31PM +0100, Andrea Righi wrote:
> On Mon, Mar 13, 2023 at 11:02:34PM +0100, Michal Suchánek wrote:
> > On Mon, Mar 13, 2023 at 10:53:34PM +0100, Andrea Righi wrote:
> > > On Mon, Mar 13, 2023 at 10:48:53PM +0100, Michal Suchánek wrote:
> > > > Hello,
> > > >
> > > > On Mon, Mar 13, 2023 at 09:32:16PM +0100, Andrea Righi wrote:
> > > > > On Wed, Jan 11, 2023 at 04:11:51PM +0000, Gary Guo wrote:
> > > > > > Currently modversion uses a fixed size array of size (64 - sizeof(long))
> > > > > > to store symbol names, thus placing a hard limit on length of symbols.
> > > > > > > Rust symbols (which encode crate and module names) can be quite a bit
> > > > > > longer. The length limit in kallsyms is increased to 512 for this reason.
> > > > > >
> > > > > > It's a waste of space to simply expand the fixed array size to 512 in
> > > > > > modversion info entries. I therefore make it variably sized, with offset
> > > > > > to the next entry indicated by the initial "next" field.
> > > > > >
> > > > > > > In addition to supporting longer-than-56/60 byte symbols, this patch also
> > > > > > > reduces the size for short symbols by getting rid of excessive 0 paddings.
> > > > > > There are still some zero paddings to ensure "next" and "crc" fields are
> > > > > > properly aligned.
> > > > > >
> > > > > > This patch does have a tiny drawback that it makes ".mod.c" files generated
> > > > > > a bit less easy to read, as code like
> > > > > >
> > > > > > "\x08\x00\x00\x00\x78\x56\x34\x12"
> > > > > > "symbol\0\0"
> > > > > >
> > > > > > is generated as opposed to
> > > > > >
> > > > > > { 0x12345678, "symbol" },
> > > > > >
> > > > > > because the structure is now variable-length. But hopefully nobody reads
> > > > > > the generated file :)
> > > > > >
> > > > > > Link: b8a94bfb3395 ("kallsyms: increase maximum kernel symbol length to 512")
> > > > > > Link: https://github.com/Rust-for-Linux/linux/pull/379
> > > > > >
> > > > > > Signed-off-by: Gary Guo <gary@xxxxxxxxxxx>
> > > > >
> > > > > Is there any newer version of this patch?
> > > > >
> > > > > I'm doing some tests with it, but I'm getting boot failures on ppc64
> > > > > with this applied (at boot kernel is spitting out lots of oops'es and
> > > > > unfortunately it's really hard to copy paste or just read them from the
> > > > > console).
> > > >
> > > > Are you using the ELF ABI v1 or v2?
> > > >
> > > > v1 may have some additional issues when it comes to these symbol tables.
> > > >
> > > > Thanks
> > > >
> > > > Michal
> > >
> > > I have CONFIG_PPC64_ELF_ABI_V2=y in my .config, so I guess I'm using v2.
> > >
> > > BTW, the issue seems to be in dedotify_versions(), as a silly test I
> > > tried to comment out this function completely to be a no-op and now my
> > > system boots fine (but I guess I'm probably breaking something else).
> >
> > Probably not. You should not have the extra leading dot on ABI v2. So if
> > dedotify does something that means something generates and then expects
> > back symbols with a leading dot, and this workaround for ABI v1 breaks
> > that. Or maybe it is called when it shouldn't.
>
> Hm.. I'll add some debugging to this function to see what happens exactly.

Alright, I've done more tests across different architectures. My problem
with ppc64 is that this architecture is computing sechdrs[i].sh_size
using get_stubs_size(), which apparently can add some extra padding. So
doing (vers + vers->next < end) isn't a reliable check to determine the
end of the variable-length array, because "end" can sometimes lie beyond
the last "vers + vers->next" entry.

In general I think it'd be more reliable to add a dummy NULL entry at
the end of the modversion array.

Moreover, I think we also need to enforce struct modversion_info to be
__packed, just to make sure that no extra padding is added (otherwise it
may break our logic to determine the offset of the next entry).
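
As a rough sketch of the two suggestions above (this is not the actual
patch; struct ver_entry, put_entry(), count_entries() and demo_count()
are names made up for illustration), a packed header plus a zero "next"
terminator lets the walker stop on its own, even when the reported
section size includes trailing padding:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout modeled on the patch: offset, CRC, then the name. */
struct ver_entry {
	uint32_t next;	/* byte offset to the next entry, 0 terminates */
	uint32_t crc;
	char name[];	/* NUL-terminated, zero-padded to 4 bytes */
} __attribute__((packed));

/* With the packed attribute the header is exactly 8 bytes. */
_Static_assert(sizeof(struct ver_entry) == 8, "header must be 8 bytes");

/* Write one entry at buf + off and return the offset of the next one. */
static size_t put_entry(unsigned char *buf, size_t off, uint32_t crc,
			const char *name)
{
	size_t len = strlen(name) + 1;
	uint32_t next = (uint32_t)(sizeof(struct ver_entry) +
				   ((len + 3) & ~(size_t)3));

	memcpy(buf + off, &next, sizeof(next));
	memcpy(buf + off + sizeof(next), &crc, sizeof(crc));
	memcpy(buf + off + sizeof(struct ver_entry), name, len);
	return off + next;
}

/*
 * Follow "next" until the zero terminator. The size argument is only a
 * hard bound, it is not used to decide where the last entry is.
 */
static int count_entries(const unsigned char *buf, size_t size)
{
	size_t off = 0;
	int n = 0;

	while (off + sizeof(struct ver_entry) <= size) {
		uint32_t next;

		memcpy(&next, buf + off, sizeof(next));
		if (next == 0)
			break;
		n++;
		off += next;
	}
	return n;
}

/* Two entries in a 128-byte buffer; the rest stays zero (padding). */
int demo_count(void)
{
	unsigned char buf[128] = { 0 };
	size_t off = 0;

	off = put_entry(buf, off, 0x12345678, "symbol");
	off = put_entry(buf, off, 0x9abcdef0, "a_much_longer_symbol_name");
	(void)off;
	return count_entries(buf, sizeof(buf));
}
```

Here the zero-filled tail of the buffer plays the role of both the dummy
terminating entry and the extra padding, so the walk ends after the two
real entries no matter how large the buffer is.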

> @@ -2062,16 +2066,25 @@ static void add_versions(struct buffer *b, struct module *mod)
> s->name, mod->name);
> continue;
> }
> - if (strlen(s->name) >= MODULE_NAME_LEN) {
> - error("too long symbol \"%s\" [%s.ko]\n",
> - s->name, mod->name);
> - break;
> - }
> - buf_printf(b, "\t{ %#8x, \"%s\" },\n",
> - s->crc, s->name);
> + name_len = strlen(s->name);
> + name_len_padded = (name_len + 1 + 3) & ~3;
> +
> + /* Offset to next entry */
> + tmp = TO_NATIVE(8 + name_len_padded);

^ Here's another issue that I found: you can't use TO_NATIVE() in this
way, because some compilers complain (on s390x, for example, this doesn't
build).

So we need to do something like:

	/* Offset to next entry */
	tmp = 8 + name_len_padded;
	tmp = TO_NATIVE(tmp);
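
My assumption here (I haven't confirmed it in this thread) is that the
cross-endian variant of TO_NATIVE() takes the address of its argument,
so it needs an lvalue and not an expression. A minimal standalone
sketch of that constraint, where swap_bytes(), SWAP_LVALUE() and
next_offset_swapped() are hypothetical stand-ins and not modpost code:

```c
#include <stddef.h>
#include <stdint.h>

/* Copy size bytes from src to dest in reverse order. */
static void swap_bytes(const void *src, void *dest, size_t size)
{
	const unsigned char *s = src;
	unsigned char *d = dest;
	size_t i;

	for (i = 0; i < size; i++)
		d[i] = s[size - 1 - i];
}

/* Takes the address of x, so x must be an lvalue. */
#define SWAP_LVALUE(x, out) swap_bytes(&(x), &(out), sizeof(out))

/* Offset to the next entry for a given name length, byte-swapped. */
uint32_t next_offset_swapped(size_t name_len)
{
	uint32_t padded = (uint32_t)((name_len + 1 + 3) & ~(size_t)3);
	uint32_t tmp = 8 + padded;	/* compute into a variable first */
	uint32_t out;

	/*
	 * SWAP_LVALUE(8 + padded, out) would not compile, because
	 * &(8 + padded) is not a valid expression.
	 */
	SWAP_LVALUE(tmp, out);
	return out;
}
```

For a 6-character name the padded length is 8 and the offset is 16, so
the swapped value has its low byte moved to the top.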

I'll do some additional tests with these changes and send an updated
patch (for those that are interested).

-Andrea