Re: [PATCH] module/kallsyms: sort function symbols and use binary search

From: Stanislaw Gruszka

Date: Wed Mar 25 2026 - 04:30:31 EST


On Tue, Mar 24, 2026 at 05:00:19PM +0100, Petr Pavlu wrote:
> On 3/24/26 1:53 PM, Stanislaw Gruszka wrote:
> > Hi,
> >
> > On Mon, Mar 23, 2026 at 02:06:43PM +0100, Petr Pavlu wrote:
> >> On 3/17/26 12:04 PM, Stanislaw Gruszka wrote:
> >>> Module symbol lookup via find_kallsyms_symbol() performs a linear scan
> >>> over the entire symtab when resolving an address. The number of symbols
> >>> in module symtabs has grown over the years, largely due to additional
> >>> metadata in non-standard sections, making this lookup very slow.
> >>>
> >>> Improve this by separating function symbols during module load, placing
> >>> them at the beginning of the symtab, sorting them by address, and using
> >>> binary search when resolving addresses in module text.
> >>
> >> Doesn't considering only function symbols break the expected behavior
> >> with CONFIG_KALLSYMS_ALL=y. For instance, when using kdb, is it still
> >> able to see all symbols in a module? The module loader should be remain
> >> consistent with the main kallsyms code regarding which symbols can be
> >> looked up.
> >
> > We already have a CONFIG_KALLSYMS_ALL=y inconsistency between kernel and
> > module symbol lookup, independent of this patch. find_kallsyms_symbol()
> > restricts the search to MOD_TEXT (or MOD_INIT_TEXT) address ranges, so
> > it cannot resolve data or rodata symbols.
>
> My understanding is that find_kallsyms_symbol() can identify all symbols
> in a module by their addresses. However, the issue I see with
> MOD_TEXT/MOD_INIT_TEXT is that the function may incorrectly calculate
> the size of symbols that are not within these ranges, which is a bug
> that should be fixed.

You are right, I misinterpreted the code:

if (within_module_init(addr, mod))
mod_mem = &mod->mem[MOD_INIT_TEXT];
else
mod_mem = &mod->mem[MOD_TEXT];

nextval = (unsigned long)mod_mem->base + mod_mem->size;

bestval = kallsyms_symbol_value(&kallsyms->symtab[best]);

For best = 0, bestval is also 0 as it comes from the ELF null symbol.

> A test using kdb confirms that non-text symbols can be found by their
> addresses. The following shows the current behavior with 7.0-rc5 when
> printing a module parameter in mlx4_en:
>
> [1]kdb> mds __param_arr_num_vfs
> 0xffffffffc1209f20 0000000100000003 ........
> 0xffffffffc1209f28 ffffffffc0fbf07c [mlx4_core]num_vfs_argc
> 0xffffffffc1209f30 ffffffff8844bba0 param_ops_byte
> 0xffffffffc1209f38 ffffffffc0fbf080 [mlx4_core]num_vfs
> 0xffffffffc1209f40 000000785f69736d msi_x...
> 0xffffffffc1209f48 656c5f6775626564 debug_le
> 0xffffffffc1209f50 00000000006c6576 vel.....
> 0xffffffffc1209f58 0000000000000000 ........
>
> .. and the behavior with the proposed patch:
>
> [1]kdb> mds __param_arr_num_vfs
> 0xffffffffc1077f20 0000000100000003 ........
> 0xffffffffc1077f28 ffffffffc104707c |p......
> 0xffffffffc1077f30 ffffffffb4a4bba0 param_ops_byte
> 0xffffffffc1077f38 ffffffffc1047080 .p......
> 0xffffffffc1077f40 000000785f69736d msi_x...
> 0xffffffffc1077f48 656c5f6775626564 debug_le
> 0xffffffffc1077f50 00000000006c6576 vel.....
> 0xffffffffc1077f58 0000000000000000 ........

Thanks for testing and pointing this out. Patch indeed breaks
the CONFIG_KALLSYMS_ALL case.

I think, possible fix would be to track the relevant sections in
__layout_sections() and use defined symbols from those sections,
instead of just function symbols.

Regards
Stanislaw