Re: [PATCH] modpost: Optimize symbol search from linear to binary search

From: Jack Brennen
Date: Sat Sep 23 2023 - 13:21:38 EST


Mainly just clarifying that we're willing to change the behavior in corner
cases? The existing behavior has one set of quirks about which symtab entry
is returned, and your proposed behavior has another different set of quirks.

It's probably OK to break builds which have an undocumented assumption
about the order of symtab entries; personally, I'd rather not risk that myself,
but if somebody with more experience is willing to back that decision, I'm
OK with it.

Also, there's an alternative approach that uses bsearch from the standard
library and a common comparison function between qsort and bsearch. I
considered this alternative earlier; maybe you would prefer it because it
eliminates having to reimplement a binary search algorithm.
I chose not to do it this way because of trying to duplicate the quirks.
If no duplication of the quirks is needed, this becomes easier.

The idea for that is to build a sorted array of syminfo that look like this:

(section_index, addr_lo, addr_hi, sym_lo, sym_hi)

What this represents is the situation where for any lookup in the
range from (section_index, addr_lo) to (section_index, addr_hi)
inclusive, the nearest symbol will be either sym_lo or sym_hi.
There are four different meanings for (sym_lo, sym_hi):

(sym_lo = 0)
This is a placeholder for a duplicated address, and it cannot
compare equal to anything. After we sort the array, we set all
of the duplicated addresses except for the last one to sym_lo = 0.

(sym_lo = MAX_SYM)
This is used to designate an address being looked up. When this
is seen, it compares equal to any other syminfo that has an
overlapping range.

(sym_lo != 0, sym_hi = 0)
This represents the last range in a section. There's no following
address that could match. Should also have addr_hi = MAX.

(sym_lo != 0, sym_hi != 0)
This represents a range in a section that's not the last range.
sym_hi may be usable to satisfy the lookup, but only if it's
closer than sym_lo and if allow_negative is true. Note that
the address of sym_hi will be addr_hi+1, so we don't need any
additional code to fetch that address.

Here's a sample comparison function:
int syminfo_compare(const void *a, const void *b) {
const struct syminfo *sym1 = a;
const struct syminfo *sym2 = b;

if (sym1->section_index > sym2->section_index)
return 1;
if (sym1->section_index < sym2->section_index)
return -1;
if ((sym1->sym_lo == MAX_SYM && sym2->sym_lo != 0) ||
(sym2->sym_lo == MAX_SYM && sym1->sym_lo != 0)) {
/* Overlap is equality - test for it */
if (sym1->addr_hi >= sym2->addr_lo &&
sym2->addr_hi >= sym1->addr_lo) {
return 0;
}
/* No overlap, fall through */
}
if (sym1->addr_lo > sym2->addr_lo)
return 1;
if (sym1->addr_lo < sym2->addr_lo)
return -1;
/* Note that if we are comparing a lookup (MAX_SYM) with
a placeholder (0), the lookup always compares greater.
This causes us to search to the "right" of the placeholder
for a match, which is what we want. */
if (sym1->sym_lo > sym2->sym_lo)
return 1;
if (sym1->sym_lo < sym2->sym_lo)
return -1;
return 0;
}

So this greatly simplifies the back-end searching. It's a bsearch()
which gives you either a miss, or one or two alternatives for the result.
On the front end, you have an extra step after sorting which massages the
search array into the right configuration. There's actually not much code
needed to do that.

Is that of interest? The leveraging of bsearch() in that way?

A few other responses below...

On Sat, Sep 23, 2023 at 4:50 AM Masahiro Yamada <masahiroy@xxxxxxxxxx> wrote:
> > +/* Populate the search array that we just allocated.
> > + * Be slightly paranoid here. If the ELF file changes during processing,
>
> I could not understand. In which case, the ELF file changes?
>
> modpost loads the entire file to memory first..
>
> In which scenario, the memory content changes?
>

modpost doesn't load the entire file, it uses mmap to map it into the
address space.The easiest way to imagine this being an issue is that some
buggy parallelization happens and something is modifying vmlinux.o while
modpost is processing it. Of course it's probably acceptable to say,
"Don't do that!"

There are two alternatives here: actually read in the entire file, which
is certainly suboptimal, or just live with the fact that mmap makes no
guarantees about whether changes in the file are reflected in the memory map.

> > + /* A bit of paranoia; make sure that the end sentinel's address is
> > + * different than its predecessor. Not doing this could cause
> > + * possible undefined behavior if anybody ever inserts a symbol
> > + * with section_index and addr both at their max values.
>
> I could not understand this comment.
>
> If section_index and addr both at their max values at [table_size - 2],
> ->table[table_size - 2].addr + 1 wraps to zero.
>
> The table is not sorted any longer?
>

That's correct, the table would not be sorted any longer. But by design,
we never do a relational comparison against the end sentinel. But if
we rework the search, either by your suggestion or by using bsearch(),
this sentinel goes away.