Re: Problems with commit 'kallsyms: add support for relative offsets in kallsyms address table' (in mmotm)

From: Guenter Roeck
Date: Sun Jan 24 2016 - 12:06:30 EST


On 01/24/2016 12:21 AM, Ard Biesheuvel wrote:
On 24 January 2016 at 08:06, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
On 01/23/2016 10:10 PM, Ard Biesheuvel wrote:



On 24 jan. 2016, at 03:35, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:

On 01/23/2016 06:06 PM, Guenter Roeck wrote:
Hi,

I see runtime problems with the current mmotm branch. All qemu mips targets
(32 and 64 bit, big and little endian) are stuck in boot after this commit.

Bisect points to commit d13682e4d9d2 ("kallsyms: add support for relative
offsets in kallsyms address table"). Disabling CONFIG_KALLSYMS_BASE_RELATIVE
fixes the problem, i.e., I can boot the image with qemu.

Bisect log is attached.

Playing with the problem, I found the following:

1) The problem is only seen with a toolchain using binutils 2.22, but not
   with a toolchain using binutils 2.25. The compiler configuration may be
   different for both toolchains.
2) Message "kallsyms failure: absolute symbol value 0xffffffff807afd14 out
   of range in relative mode" (twice) when using the toolchain with
   binutils 2.22. This does not cause the build to fail, though.
3) kallsyms_sym_address() parameter variable type is "int". In the calling
   code, the variable type used is "unsigned long". That has no impact on
   the problem, though.
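
To illustrate why the value in the message from item 2 is rejected, here is
a minimal sketch of such a range check. It is not the code in
scripts/kallsyms.c: the helper name fits_relative_table() is hypothetical,
and it assumes each table entry is 32 bits wide, so that an absolute symbol
has to fit in 0..INT_MAX while a relative symbol is stored as an unsigned
32-bit offset from the base.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not taken from scripts/kallsyms.c.
 * Assumption: table entries are 32 bits wide. An absolute symbol is
 * stored as-is and must fit in 0..INT_MAX; a relative symbol is stored
 * as an unsigned 32-bit offset from the base address.
 */
static bool fits_relative_table(unsigned long long addr,
                                unsigned long long base,
                                bool is_absolute)
{
    if (is_absolute)
        return addr <= (unsigned long long)INT_MAX;

    /* A relative symbol below the base cannot be encoded. */
    if (addr < base)
        return false;

    return addr - base <= (unsigned long long)UINT_MAX;
}
```

Under these assumptions, 0xffffffff807afd14 fails the check when treated as
absolute, but passes when treated as relative to a base of
0xffffffff80100000 (the _text address from the mips System.map).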


An additional data point: When using the older toolchain, many symbols in
System.map are marked "A".
ffffffff80100000 A _text
With the more recent toolchain, the same symbols are marked "T".
ffffffff80100000 T _text


Thanks for the analysis. It is surprising that the build does not fail
when this occurs, and the subsequent hangs themselves are probably caused by
missing kallsyms data.

Yes, I wondered why the build doesn't fail. Seems odd.

scripts/kallsyms.c ignores all A symbols except _text, which is actually a
relative symbol by nature so we can simply assume it is relative (i.e.,
override it as T)
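
The override described above could look roughly like the sketch below. It is
an illustration only: struct nm_symbol and parse_nm_line() are hypothetical
names, and the code assumes the "address type name" line layout shown in the
System.map excerpts.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record; the field names are illustrative only. */
struct nm_symbol {
    unsigned long long addr;
    char type;
    char name[128];
};

/*
 * Parse one "address type name" line and treat an absolute _text as a
 * text symbol. Returns 0 on success, -1 if the line does not contain
 * the three expected fields.
 */
static int parse_nm_line(const char *line, struct nm_symbol *out)
{
    if (sscanf(line, "%llx %c %127s",
               &out->addr, &out->type, out->name) != 3)
        return -1;

    /* _text is relative by nature, so override 'A' with 'T'. */
    if (out->type == 'A' && strcmp(out->name, "_text") == 0)
        out->type = 'T';

    return 0;
}
```

Only _text is rewritten here; any other symbol marked "A" keeps its type.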

Re x86_64 !SMP, any build time errors there as well? Likewise for sparc32?


Yes, same kind of errors for both. For x86_64/nosmp I also get the error
message when using the Ubuntu native toolchain, so it doesn't seem to be
(directly) related to binutils 2.22 vs. 2.25 for that architecture.

Runtime behavior is a bit different for the different architectures.
x86_64 dies silently without any console output, mips just hangs,
and sparc32 gets a panic with NULL pointer access.
Of course, with missing kallsyms data all bets are off.


Thanks again, and sorry for the trouble,


No worries. Hope you'll get this sorted out.


OK, there's an additional issue in my latest version: the
kallsyms_relative_base value itself is not relocated.

If you have more time to burn on this, could you try the following on
top? (If not, that is also fine, I will look into it myself on Monday)

diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 5ab13394dfd9..0f43f0751d47 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -137,8 +137,10 @@ static int read_symbol(FILE *in, struct sym_entry *s)
 		sym++;

 	/* Ignore most absolute/undefined (?) symbols. */
-	if (strcmp(sym, "_text") == 0)
+	if (strcmp(sym, "_text") == 0) {
 		_text = s->addr;
+		stype = 'T';
+	}
 	else if (check_symbol_range(sym, s->addr, text_ranges,
 				    ARRAY_SIZE(text_ranges)) == 0)
 		/* nothing to do */;
@@ -406,7 +408,7 @@ static void write_src(void)

 	if (base_relative) {
 		output_label("kallsyms_relative_base");
-		printf("\tPTR\t%#llx\n", relative_base);
+		printf("\tPTR\t_text - %#llx\n", _text - relative_base);
 		printf("\n");
 	}

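The second hunk swaps a literal constant for an expression relative to
_text, presumably so that the emitted value follows _text when the image is
relocated. A small sketch of the two output forms follows; the helper
format_relative_base() and the addresses in the note after it are
hypothetical, and only the format strings come from the patch.

```c
#include <stdio.h>

/*
 * Hypothetical helper mirroring the two printf() variants in the patch.
 * Writes the assembler line for kallsyms_relative_base into buf, either
 * as a literal constant or as an offset from the _text symbol.
 */
static int format_relative_base(char *buf, size_t len,
                                unsigned long long text_addr,
                                unsigned long long relative_base,
                                int symbolic)
{
    if (symbolic)
        return snprintf(buf, len, "\tPTR\t_text - %#llx",
                        text_addr - relative_base);

    return snprintf(buf, len, "\tPTR\t%#llx", relative_base);
}
```

With an assumed _text of 0xffffffff80100000 and an assumed relative base of
0xffffffff80000000, the symbolic form yields "PTR _text - 0x100000" and the
literal form yields "PTR 0xffffffff80000000".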

Does not help.

Here is part of the problem. This is from a log message added to make_percpus_absolute().

Marking symbol 'B__bss_start' as absolute
Marking symbol '?__init_end' as absolute
Marking symbol 'D__nosave_begin' as absolute
Marking symbol 'D__nosave_end' as absolute
Marking symbol 'D__per_cpu_end' as absolute
Marking symbol 'D__per_cpu_load' as absolute
Marking symbol 'D__per_cpu_start' as absolute
Marking symbol '?__smp_locks' as absolute
Marking symbol '?__smp_locks_end' as absolute
Marking symbol 'Bempty_zero_page' as absolute

This is with x86_64/nosmp. At least some of those symbols don't really reflect
'percpu' values. Maybe the distinction between percpu and non-percpu variables
gets lost if SMP is not configured.

On top of that, older versions of binutils mark additional symbols as absolute,
even with x86_64.

ffffffff81a00000 A __end_rodata_hpage_align
ffffffff81b19000 A __vvar_page
ffffffff81d3d000 A _end

Hope this helps,
Guenter