On Mar 8, 2018, at 9:08 PM, Alexei Starovoitov <ast@xxxxxx> wrote:
On 3/8/18 7:54 PM, Andy Lutomirski wrote:
On Mar 8, 2018, at 7:06 PM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
Honestly, that "read twice" thing may be what scuttles this.
Initially, I thought it was a non-issue, because anybody who controls
the module subdirectory enough to rewrite files would be in a position
to just execute the file itself directly instead.
On further consideration, I think thereâs another showstopper. This patch is a potentially severe ABI break. Right now, loading a module *copies* it into memory and does not hold a reference to the underlying fs. With the patch applied, all kinds of use cases can break in gnarly ways. Initramfs is maybe okay, but initrd may be screwed. If you load an ET_EXEC module from initrd, then umount it, then clear the ramdisk, something will go horribly wrong. Exactly what goes wrong depends on whether userspace notices that umount() failed. Similarly, if you load one of these modules over a network and then lose your connection, you have a problem.
there is not abi breakage and file cannot disappear from running task.
One cannot umount fs while file is still being used.
Sure it is. Without your patch, init_module doesnât keep using the
file, so itâs common practice to load a module and then delete or
unmount it. With your patch, the unmount case breaks. This is likely
to break existing userspace, so, in Linux speak itâs an ABI break.
The âread twiceâ thing is also bad for another reason: containers. Suppose I have a setup where a container can load a signed module blob. With the read twice code, the container can race and run an entirely different blob outside the container.
Not only "read twice", but "read many".
If .text sections of elf that are not yet in memory can be modified
by malicious user, later they will be brought in with different code.
I think the easiest fix to tighten this "umh modules" to CAP_SYS_ADMIN.
Given this issue, I think the patch would need Keesâs explicit ack. I
had initially thought your patch had minimal security impact, but I
was wrong Module security is very complicated and needs to satisfy a
bunch of requirements. There is a lot of code in the kernel that
assumes that itâs sufficient to verify a module once at load time,
your patch changes that, and this has all kinds of nasty interactions
with autoloading.
Kees is very reasonable, and heâll change his mind
and ack a patch that heâs nacked when presented with a valid technical
argument.
But I think my ABI break observation is also a major problem, and
Linus is going to be pissed if this thing lands in his tree and breaks
systems due to an issue that was raised during review. So I think you
need to either rework the patch or do a serious survey of how all the
distros deal with modules (dracut, initramfs-tools, all the older
stuff, and probably more) and make sure they can all handle your
patch. I'd also be concerned about anyone who puts /lib/modules on
less reliable storage than they use for the rest of their system. (I
know it's quite common to have /boot be the only non-RAID partition on
a system, but modules don't generally live in /boot.)
Also, I think you really ought to explain how your approach will work
with MODULES=n or convince Linus that itâs okay to start adding core
networking features that donât work with MODULES=n and can't be built
into the main kernel image.