Re: [PATCH v3 2/3] kernel/resource: disallow access to exclusive system RAM regions

From: Williams, Dan J
Date: Wed Sep 01 2021 - 15:37:43 EST


On Tue, 2021-08-31 at 22:21 +0200, David Hildenbrand wrote:
> virtio-mem dynamically exposes memory inside a device memory region as
> system RAM to Linux, coordinating with the hypervisor which parts are
> actually "plugged" and consequently usable/accessible. On the one hand, the
> virtio-mem driver adds/removes whole memory blocks, creating/removing busy
> IORESOURCE_SYSTEM_RAM resources; on the other hand, it logically (un)plugs
> memory inside added memory blocks, dynamically either exposing them to
> the buddy or hiding them from the buddy and marking them PG_offline.
>
> In contrast to physical devices, like a DIMM, the virtio-mem driver
> is required to actually make use of any of the device-provided memory,
> because it performs the handshake with the hypervisor. virtio-mem memory
> cannot simply be accessed via /dev/mem without a driver.
>
> There is no safe way to:
> a) Access plugged memory blocks via /dev/mem, as they might contain
>    unplugged holes or might get silently unplugged by the virtio-mem
>    driver and consequently turned inaccessible.
> b) Access unplugged memory blocks via /dev/mem because the virtio-mem
>    driver is required to make them actually accessible first.
>
> The virtio-spec states that unplugged memory blocks MUST NOT be
> written, and only selected unplugged memory blocks MAY be read. We want
> to make sure this is the case in sane environments -- where the
> virtio-mem driver was loaded.
>
> We want to make sure that in a sane environment, nobody "accidentally"
> accesses unplugged memory inside the device managed region. For example,
> a user might spot a memory region in /proc/iomem and try accessing it via
> /dev/mem via gdb or dumping it via something else. By the time the mmap()
> happens, the memory might already have been removed by the virtio-mem
> driver silently: the mmap() would succeed and user space might
> accidentally access unplugged memory.
>
> So once the driver was loaded and detected the device along with the
> device-managed region, we just want to disallow any access via
> /dev/mem to it.
>
> In an ideal world, we would mark the whole region as busy ("owned by a
> driver") and exclude it; however, that would be wrong, as we don't
> really have actual system RAM at these ranges added to Linux ("busy system
> RAM"). Instead, we want to mark such ranges as "not actual busy system RAM
> but still soft-reserved and prepared by a driver for future use."
>
> Let's teach iomem_is_exclusive() to reject access to any range
> with "IORESOURCE_SYSTEM_RAM | IORESOURCE_EXCLUSIVE", even if not busy
> and even if "iomem=relaxed" is set.
>
> For now, there are no applicable ranges and we'll modify virtio-mem next to
> properly set IORESOURCE_EXCLUSIVE on the parent resource container it
> creates to contain all actual busy system RAM added via
> add_memory_driver_managed().
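
For illustration, the rule described above can be modelled in standalone
C. This is a minimal sketch under stated assumptions, not the patch
itself: struct region, the RES_* flag values and range_is_exclusive() are
hypothetical stand-ins for struct resource, the IORESOURCE_* bits and
iomem_is_exclusive(), the resource tree is flattened into an array, and
the "strict" parameter stands in for the "iomem=relaxed" setting being
absent. Any CONFIG-dependent behaviour is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch; not the kernel's IORESOURCE_* values. */
#define RES_BUSY       0x1u
#define RES_SYSTEM_RAM 0x2u
#define RES_EXCLUSIVE  0x4u

/* Simplified stand-in for struct resource: one flat range, both ends inclusive. */
struct region {
    uint64_t start;
    uint64_t end;
    unsigned int flags;
};

/*
 * Return true if [addr, addr + size) overlaps a region that must not be
 * accessed. "strict" models the relaxed setting being off (assumption).
 */
static bool range_is_exclusive(const struct region *regions, size_t n,
                               uint64_t addr, uint64_t size, bool strict)
{
    const unsigned int excl_ram = RES_SYSTEM_RAM | RES_EXCLUSIVE;
    uint64_t last;
    size_t i;

    if (size == 0)
        return false;
    last = addr + size - 1;

    for (i = 0; i < n; i++) {
        const struct region *r = &regions[i];

        /* Skip regions that do not overlap the requested range. */
        if (r->end < addr || r->start > last)
            continue;

        /*
         * Exclusive system RAM is always rejected, even if the region
         * is not busy and even in relaxed mode.
         */
        if ((r->flags & excl_ram) == excl_ram)
            return true;

        /* Other regions are rejected only if strict, busy and exclusive. */
        if (!strict || !(r->flags & RES_BUSY))
            continue;
        if (r->flags & RES_EXCLUSIVE)
            return true;
    }
    return false;
}
```

In this model, a region flagged RES_SYSTEM_RAM | RES_EXCLUSIVE is rejected
regardless of "strict" and regardless of RES_BUSY, while a busy exclusive
region that is not system RAM is rejected only when "strict" is true.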

Looks good,

Reviewed-by: Dan Williams <dan.j.williams@xxxxxxxxx>