Re: [PATCH] exec: allow executing block devices

From: Kees Cook
Date: Tue Oct 10 2023 - 18:48:21 EST


On Tue, Oct 10, 2023 at 09:21:33AM +0000, Alyssa Ross wrote:
> As far as I can tell, the S_ISREG() check is there to prevent
> executing files where that would be nonsensical, like directories,
> fifos, or sockets. But the semantics for executing a block device are
> quite obvious — the block device acts just like a regular file.
>
> My use case is having a common VM image that takes a configurable
> payload to run. The payload will always be a single ELF file.
>
> I could share the file with virtio-fs, or I could create a disk image
> containing a filesystem containing the payload, but both of those add
> unnecessary layers of indirection when all I need to do is share a
> single executable blob with the VM. Sharing it as a block device is
> the most natural thing to do, aside from the (arbitrary, as far as I
> can tell) restriction on executing block devices. (The only slight
> complexity is that I need to ensure that my payload size is rounded up
> to a whole number of sectors, but that's trivial and fast in
> comparison to e.g. generating a filesystem image.)
>
> Signed-off-by: Alyssa Ross <hi@xxxxxxxxx>

Hi,

Thanks for the suggestion! I would prefer to not change this rather core
behavior in the kernel for a few reasons, but it mostly revolves around
both user and developer expectations and the resulting fragility.

For users, this hasn't been possible in the past, so if we make it
possible, what situations are suddenly exposed on systems that are trying
to very carefully control their execution environments?

For developers, this ends up exercising code areas that have never been
tested, and could lead to unexpected conditions. For example,
deny_write_access() is explicitly documented as "for regular files".
Perhaps it accidentally works with block devices, but confirming that
would need much more careful examination.

And looking at this from a design perspective, it looks like a
layering violation: roughly speaking, the kernel executes files, from
filesystems, from block devices. Bypassing layers tends to lead to
troublesome bugs and other weird problems.

I wonder, though, if you can already get what you need through other
existing mechanisms that aren't too much more hassle? For example,
what about having a tool that creates a memfd from a block device and
executes that? The memfd code has been used in a lot of odd exec corner
cases in the past...

-Kees

--
Kees Cook