Re: [PATCH] exec: allow executing block devices

From: Alyssa Ross
Date: Wed Oct 11 2023 - 03:39:01 EST


Kees Cook <keescook@xxxxxxxxxxxx> writes:

> On Tue, Oct 10, 2023 at 09:21:33AM +0000, Alyssa Ross wrote:
>> As far as I can tell, the S_ISREG() check is there to prevent
>> executing files where that would be nonsensical, like directories,
>> fifos, or sockets. But the semantics for executing a block device are
>> quite obvious — the block device acts just like a regular file.
>>
>> My use case is having a common VM image that takes a configurable
>> payload to run. The payload will always be a single ELF file.
>>
>> I could share the file with virtio-fs, or I could create a disk image
>> containing a filesystem containing the payload, but both of those add
>> unnecessary layers of indirection when all I need to do is share a
>> single executable blob with the VM. Sharing it as a block device is
>> the most natural thing to do, aside from the (arbitrary, as far as I
>> can tell) restriction on executing block devices. (The only slight
>> complexity is that I need to ensure that my payload size is rounded up
>> to a whole number of sectors, but that's trivial and fast in
>> comparison to e.g. generating a filesystem image.)
>>
>> Signed-off-by: Alyssa Ross <hi@xxxxxxxxx>
>
> Hi,
>
> Thanks for the suggestion! I would prefer to not change this rather core
> behavior in the kernel for a few reasons, but it mostly revolves around
> both user and developer expectations and the resulting fragility.
>
> For users, this hasn't been possible in the past, so if we make it
> possible, what situations are suddenly exposed on systems that are trying
> to very carefully control their execution environments?

I expect very few, considering it's still necessary to have root chmod
the block device to make it executable.

> For developers, this ends up exercising code areas that have never been
> tested, and could lead to unexpected conditions. For example,
> deny_write_access() is explicitly documented as "for regular files".
> Perhaps it accidentally works with block devices, but this would need
> much more careful examination, etc.
>
> And while looking at this from a design perspective, it looks like a
> layering violation: roughly speaking, the kernel executes files, from
> filesystems, from block devices. Bypassing layers tends to lead to
> troublesome bugs and other weird problems.
>
> I wonder, though, if you can already get what you need through other
> existing mechanisms that aren't too much more hassle? For example,
> what about having a tool that creates a memfd from a block device and
> executes that? The memfd code has been used in a lot of odd exec corner
> cases in the past...

Is it possible to have a file-backed memfd? Strange name if so!
