[PATCH v2] kallsyms: Fix wrong "big" kernel symbol type read from procfs

From: Zheng Yejian
Date: Fri Oct 11 2024 - 02:38:45 EST


Currently, when a symbol name is longer than 0x7f characters, the type
shown for it in /proc/kallsyms can be incorrect.

I found this issue while reading the code, but it can be reproduced
with the following steps:

1. Define a function whose symbol name is 130 characters long:

#define X13(x) x##x##x##x##x##x##x##x##x##x##x##x##x
static noinline void X13(x123456789)(void)
{
	printk("hello world\n");
}

2. The type in vmlinux is 't':

$ nm vmlinux | grep x123456
ffffffff816290f0 t x123456789x123456789x123456789x12[...]

3. Boot the kernel. The type shown in /proc/kallsyms is 'g' instead of
the expected 't':

# cat /proc/kallsyms | grep x123456
ffffffff816290f0 g x123456789x123456789x123456789x12[...]

The root cause is that commit 73bbb94466fd ("kallsyms: support "big"
kernel symbols") switched to ULEB128 for encoding the symbol name
length. As a result, for "big" kernel symbols whose name length exceeds
0x7f characters, the length is encoded in 2 bytes.
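
For illustration only (not part of this patch), here is a minimal
userspace sketch of the one-or-two-byte length encoding described
above. The helper names encode_len() and decode_len() are hypothetical
and do not exist in the kernel, and the sketch assumes lengths below
0x4000, which is all that two bytes can hold.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: store a length in one byte if it fits in 7 bits,
 * otherwise in two bytes with the MSB of the first byte set.
 * Returns the number of bytes written.
 */
static size_t encode_len(unsigned int len, uint8_t *out)
{
	assert(len < 0x4000);	/* two bytes carry at most 14 bits */

	if (len <= 0x7f) {
		out[0] = (uint8_t)len;
		return 1;
	}
	out[0] = (uint8_t)(0x80 | (len & 0x7f));	/* low 7 bits plus flag */
	out[1] = (uint8_t)(len >> 7);			/* remaining high bits */
	return 2;
}

/*
 * Hypothetical helper: read the length back.
 * Returns the number of bytes consumed.
 */
static size_t decode_len(const uint8_t *in, unsigned int *len)
{
	if (!(in[0] & 0x80)) {
		*len = in[0];
		return 1;
	}
	*len = (in[0] & 0x7fu) | ((unsigned int)in[1] << 7);
	return 2;
}
```

With this encoding a length of 0x7f still takes one byte, while a
length of 131 becomes the two bytes 0x83 0x01.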

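kallsyms_get_symbol_type() expects to read the first char of the
symbol name, which indicates the symbol type. However, it does not
handle the "big" symbol case: it always assumes a one-byte length, so
for a "big" symbol it takes the second length byte as the first token
and the symbol type read from /proc/kallsyms may be wrong. Handle the
two-byte length properly.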
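
The effect can be sketched in userspace with made-up tables. This is an
illustration under stated assumptions, not kernel code: everything
prefixed with mock_ is hypothetical data, chosen so that the result
mirrors the 't' versus 'g' seen in the reproduction. The real token
tables differ.

```c
#include <assert.h>
#include <stdint.h>

/* Mock data, NOT the real kernel tables: three one-character tokens. */
static const char mock_token_table[] = "A\0g\0t";	/* tokens 0, 1, 2 */
static const uint16_t mock_token_index[] = { 0, 2, 4 };

/*
 * One mock "big" entry: two length bytes (0x83 0x01 encode 131), then
 * the first token, whose first character is the symbol type. The rest
 * of the entry is omitted because it is never read here.
 */
static const uint8_t mock_names[] = { 0x83, 0x01, 2 };

/* Return the first character of the token stored at mock_names[pos]. */
static char first_token_char(unsigned int pos)
{
	assert(pos < sizeof(mock_names));
	assert(mock_names[pos] < 3);	/* only three mock tokens exist */
	return mock_token_table[mock_token_index[mock_names[pos]]];
}

/* Lookup as before the patch: always assumes a one-byte length. */
static char type_before_patch(unsigned int off)
{
	return first_token_char(off + 1);
}

/* Lookup with the patch applied: skip the extra length byte if present. */
static char type_after_patch(unsigned int off)
{
	if (mock_names[off] & 0x80)
		off++;
	return first_token_char(off + 1);
}
```

For this mock entry, type_before_patch(0) treats the second length
byte as a token index and returns 'g', while type_after_patch(0)
returns 't'.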

Cc: stable@xxxxxxxxxxxxxxx
Fixes: 73bbb94466fd ("kallsyms: support "big" kernel symbols")
Signed-off-by: Zheng Yejian <zhengyejian@xxxxxxxxxxxxxxx>
---

v1 -> v2:
- Add reproduction info into commit message to make it clearer;
- Add cc: stable line;

v1: https://lore.kernel.org/all/20240830062935.1187613-1-zhengyejian@xxxxxxxxxxxxxxx/

kernel/kallsyms.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index a9a0ca605d4a..9e4bf061bb83 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -103,8 +103,11 @@ static char kallsyms_get_symbol_type(unsigned int off)
 {
 	/*
 	 * Get just the first code, look it up in the token table,
-	 * and return the first char from this token.
+	 * and return the first char from this token. If MSB of length
+	 * is 1, it is a "big" symbol, so needs an additional byte.
 	 */
+	if (kallsyms_names[off] & 0x80)
+		off++;
 	return kallsyms_token_table[kallsyms_token_index[kallsyms_names[off + 1]]];
 }

--
2.25.1