Re: [PATCH 1/2] module: Overwrite st_size instead of st_info

From: Dave Martin
Date: Fri Nov 23 2018 - 12:21:13 EST


On Thu, Nov 22, 2018 at 05:49:23PM +0000, Russell King - ARM Linux wrote:
> On Thu, Nov 22, 2018 at 06:40:45PM +0100, Ard Biesheuvel wrote:
> > On Thu, 22 Nov 2018 at 17:29, Jessica Yu <jeyu@xxxxxxxxxx> wrote:
> > >
> > > +++ Vincent Whitchurch [22/11/18 13:24 +0100]:
> > > >On Thu, Nov 22, 2018 at 12:01:54PM +0000, Dave Martin wrote:
> > > >> On Mon, Nov 19, 2018 at 05:25:12PM +0100, Vincent Whitchurch wrote:
> > > >> > st_info is currently overwritten after relocation and used to store the
> > > >> > elf_type(). However, we're going to need it to fix kallsyms on ARM's
> > > >> > Thumb-2 kernels, so preserve st_info and overwrite the st_size field
> > > >> > instead. st_size is neither used by the module core nor by any
> > > >> > architecture.
> > > >> >
> > > >> > Signed-off-by: Vincent Whitchurch <vincent.whitchurch@xxxxxxxx>
> > > >> > ---
> > > >> > v4: Split out to separate patch. Use st_size instead of st_other.
> > > >> > v1-v3: See PATCH 2/2
> > > >> >
> > > >> > kernel/module.c | 4 ++--
> > > >> > 1 file changed, 2 insertions(+), 2 deletions(-)
> > > >> >
> > > >> > diff --git a/kernel/module.c b/kernel/module.c
> > > >> > index 49a405891587..3d86a38b580c 100644
> > > >> > --- a/kernel/module.c
> > > >> > +++ b/kernel/module.c
> > > >> > @@ -2682,7 +2682,7 @@ static void add_kallsyms(struct module *mod, const struct load_info *info)
> > > >> >
> > > >> > /* Set types up while we still have access to sections. */
> > > >> > for (i = 0; i < mod->kallsyms->num_symtab; i++)
> > > >> > - mod->kallsyms->symtab[i].st_info
> > > >> > + mod->kallsyms->symtab[i].st_size
> > > >> > = elf_type(&mod->kallsyms->symtab[i], info);
> > > >> >
> > > >> > /* Now populate the cut down core kallsyms for after init. */
> > > >> > @@ -4061,7 +4061,7 @@ int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
> > > >> > kallsyms = rcu_dereference_sched(mod->kallsyms);
> > > >> > if (symnum < kallsyms->num_symtab) {
> > > >> > *value = kallsyms->symtab[symnum].st_value;
> > > >> > - *type = kallsyms->symtab[symnum].st_info;
> > > >> > + *type = kallsyms->symtab[symnum].st_size;
> > > >> > strlcpy(name, symname(kallsyms, symnum), KSYM_NAME_LEN);
> > > >> > strlcpy(module_name, mod->name, MODULE_NAME_LEN);
> > > >> > *exported = is_exported(name, *value, mod);
> > > >>
> > > >> This is fine if st_size is really unused, but how sure are you of that?
> > > >>
> > > >> grepping for st_size throws up some hits that appear ELF-related, but
> > > >> I've not investigated them in detail.
> > > >>
> > > >> (The fact that struct stat has an identically named field throws up
> > > >> a load of false positives too.)
> > > >
> > > >$ git describe --tags
> > > >v4.20-rc3-93-g92b419289cee
> > > >
> > > >$ rg -m1 '[\.>]st_size' --iglob '!**/tools/**' --iglob '!**/vdso*' --iglob '!**/scripts/**' --iglob '!**/usr/**' --iglob '!**/samples/**' | cat
> > > >| kernel/kexec_file.c: if (sym->st_size != size) {
> > > >
> > > >Symbols in kexec kernel.
> > > >
> > > >| fs/stat.c: tmp.st_size = stat->size;
> > > >| Documentation/networking/tls.txt: sendfile(sock, file, &offset, stat.st_size);
> > > >| net/9p/client.c: ret->st_rdev, ret->st_size, ret->st_blksize,
> > > >| net/9p/protocol.c: &stbuf->st_rdev, &stbuf->st_size,
> > > >| fs/9p/vfs_inode_dotl.c: i_size_write(inode, stat->st_size);
> > > >| fs/hostfs/hostfs_user.c: p->size = buf->st_size;
> > > >| arch/powerpc/boot/mktree.c: nblks = (st.st_size + IMGBLK) / IMGBLK;
> > > >| arch/alpha/kernel/osf_sys.c: tmp.st_size = lstat->size;
> > > >| arch/x86/ia32/sys_ia32.c: __put_user(stat->size, &ubuf->st_size) ||
> > > >
> > > >Not Elf_Sym.
> > > >
> > > >| arch/x86/kernel/machine_kexec_64.c: sym->st_size);
> > > >
> > > >Symbols in kexec kernel.
> > > >
> > > >| arch/sparc/boot/piggyback.c: st4(buffer + 12, s.st_size);
> > > >| arch/sparc/kernel/sys_sparc32.c: err |= put_user(stat->size, &statbuf->st_size);
> > > >| arch/um/os-Linux/file.c: .ust_size = src->st_size, /* total size, in bytes */
> > > >| arch/um/os-Linux/start_up.c: size = (buf.st_size + UM_KERN_PAGE_SIZE) & ~(UM_KERN_PAGE_SIZE - 1);
> > > >| arch/s390/kernel/compat_linux.c: tmp.st_size = stat->size;
> > > >| arch/arm/kernel/sys_oabi-compat.c: tmp.st_size = stat->size;
> > > >| arch/mips/boot/compressed/calc_vmlinuz_load_addr.c: vmlinux_size = (uint64_t)sb.st_size;
> > > >| drivers/net/ethernet/marvell/sky2.c: hw->st_idx = RING_NEXT(hw->st_idx, hw->st_size);
> > > >
> > > >Not Elf_Sym.
> > >
> > > [ added Miroslav to CC, just in case he would like to check :) ]
> > >
> > > I have just double checked as well, and am fairly certain that the
> > > Elf_Sym st_size field is not used to apply module relocations in any
> > > arches, and it is not used in the core module loader nor in the module
> > > kallsyms code. We'd like to avoid overwriting st_info in any case, to
> > > fix kallsyms on Thumb-2 and also so that livepatch won't run into any
> > > issues with delayed relocations, should livepatch support ever expand
> > > to arches (e.g., arm) that rely on st_info for module relocations.
> > >
> >
> > Also note that st_size cannot be relied upon in general, since we
> > overwrite the addresses of undefined symbols in a module's symbol
> > table when we resolve them against ksyms, while the st_size field is kept
> > at 0. At relocation time, we don't really distinguish anymore between
> > local and external module symbols, and so relying on st_size to be
> > accurate would definitely break things.
>
> Umm.
>
> Undefined symbols of course have a zero size, because it's not known
> how large the definition that the symbol refers to is. However,
> the non-undefined symbols should have a size with them - anything
> emitted by the compiler should, but only if .size has been used in
> the assembler will assembly have correct sizes.

which I guess is one reason st_size is a bit unreliable.

For kallsyms purposes we assume that the size of each symbol is just the
offset from it to the next symbol, which seems good enough in practice.
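
As a rough illustration of that assumption (a minimal sketch, not the
kernel's actual lookup code; struct sym, inferred_size() and region_end
are hypothetical names made up for this example), the size of a symbol
can be estimated as the distance to the nearest symbol at a higher
address, falling back to the end of the containing region:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified symbol record; this is not Elf_Sym. */
struct sym {
	uint64_t addr;
	const char *name;
};

/*
 * Estimate the size of syms[idx] without consulting st_size: take the
 * distance from its address to the nearest symbol at a strictly higher
 * address, or to region_end if no such symbol exists.  The table does
 * not need to be sorted.  Assumes region_end is above every symbol.
 */
static uint64_t inferred_size(const struct sym *syms, size_t n, size_t idx,
			      uint64_t region_end)
{
	uint64_t start = syms[idx].addr;
	uint64_t next = region_end;
	size_t i;

	for (i = 0; i < n; i++) {
		if (syms[i].addr > start && syms[i].addr < next)
			next = syms[i].addr;
	}

	return next - start;
}
```

Aliased symbols at the same address all get the same estimate here,
since only strictly higher addresses count as "the next symbol".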

> Aren't SHN_UNDEF symbols ignored, except when applying the
> relocations, where we rely on st_value to be set on SHN_UNDEF
> symbols?

Presumably, but it looks like st_size is pretty much a don't-care.

I renamed st_size in the struct in elf.h to a garbage name and the
kernel and modules still build fine, though I only tried this for arm64
so far.

This suggests that, at runtime at least, nothing uses this field at all
unless it's in arch-specific code for other arches.
Others seem confident that it's not used anywhere, and although I've not
gone through all the grep hits, I've only encountered false positives so
far.
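
For reference, the effect of the patch on a single symbol can be
sketched like this.  It is only an illustration under stated
assumptions: struct demo_sym and the DEMO_* macros are local stand-ins
that mirror the ELF64 symbol layout and the standard binding/type
packing of st_info, and stash_type()/stashed_type() are hypothetical
helpers, not functions from kernel/module.c.

```c
#include <stdint.h>

/* Local stand-in mirroring the ELF64 symbol layout (hypothetical name). */
struct demo_sym {
	uint32_t st_name;
	uint8_t  st_info;	/* binding in high nibble, type in low nibble */
	uint8_t  st_other;
	uint16_t st_shndx;
	uint64_t st_value;
	uint64_t st_size;
};

#define DEMO_ST_BIND(info)	((info) >> 4)
#define DEMO_ST_TYPE(info)	((info) & 0xf)
#define DEMO_STB_GLOBAL		1
#define DEMO_STT_FUNC		2

/* Record the kallsyms type character in st_size; st_info is untouched. */
static void stash_type(struct demo_sym *sym, char type)
{
	sym->st_size = (unsigned char)type;
}

/* Read back the type character previously stored by stash_type(). */
static char stashed_type(const struct demo_sym *sym)
{
	return (char)sym->st_size;
}
```

After stash_type(&s, 'T'), the original size is gone but
DEMO_ST_TYPE(s.st_info) still reports the ELF symbol type, which is the
information arch code such as ARM's would need from st_info.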

Cheers
---Dave