Re: why is the sys_close symbol exported ?

From: One Thousand Gnomes
Date: Fri Nov 18 2016 - 08:49:59 EST


On Fri, 18 Nov 2016 09:56:52 +0100
jmfriedt <jmfriedt@xxxxxxxxxxx> wrote:

> Following the various rootkit and system call redirection developments, the current
> way of identifying the location of the system call table seems to be brute force scanning
> the memory for the location of one of the system calls. This is only possible from a module
> if the symbol is exported: I see that only one system call symbol is still exported, that
> is sys_close. Removing this symbol export would hinder one of the ways of finding the
> system call table: I have not been able to find the reason for exporting this particular
> symbol (while sys_open for example is not exported). Can anyone justify why that is ?
>
> Thank you, Jean-Michel
>

find . -name "*.[ch]" -exec grep -H sys_close {} \;

So currently it is needed by autofs4, binfmt_misc, net/kcm/kcmsock and
those look like legitimate use cases.

It might be worth changing sys_close to just wrap a call to
do_sys_close() which is the existing code. That would make the scan
slightly harder.
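
As a rough userspace illustration of one reading of that suggestion
(this is a toy, not kernel code, and every name in it other than
do_sys_close is made up for the example): if the table slot holds only
a thin wrapper, then comparing table entries against the address of the
function that modules can see no longer finds the slot.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace toy, NOT kernel code. All names are illustrative. */
typedef long (*syscall_fn)(unsigned int);

/* Stand-in for the existing close code; this is the symbol modules see. */
long do_sys_close(unsigned int fd)
{
	return fd < 1024 ? 0 : -9;	/* -9 mimics -EBADF */
}

/* Thin unexported wrapper: the only close-related address in the table. */
static long sys_close_wrapper(unsigned int fd)
{
	return do_sys_close(fd);
}

/* Filler entry standing in for every other system call. */
static long sys_other(unsigned int arg)
{
	return (long)arg;
}

/* Table holding the visible symbol directly (the situation today). */
const syscall_fn table_direct[4] = {
	sys_other, sys_other, sys_other, do_sys_close
};

/* Table holding only the wrapper (the suggested change). */
const syscall_fn table_wrapped[4] = {
	sys_other, sys_other, sys_other, sys_close_wrapper
};

/* Return the index of the first slot equal to 'known', or -1 if none. */
long find_entry(const syscall_fn *table, size_t n, syscall_fn known)
{
	size_t i;

	assert(table != NULL);
	for (i = 0; i < n; i++)
		if (table[i] == known)
			return (long)i;
	return -1;
}
```

Here find_entry(table_direct, 4, do_sys_close) locates the slot, while
the same search over table_wrapped comes back empty, even though
calling through that slot still ends up in do_sys_close().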

That said, anyone doing syscall table scanning can do it in at least
two other pretty reliable ways by working from the system call entry
point, which is trivially discoverable.

It's one reason I'd really like to see kvm/qemu provide a 'read only
until the virtual machine exits' memory range so that you can
irrevocably protect page ranges (like the syscalls and much of the
kernel code) within a VM.

Alan