[PATCH v2 2/3] devmem: Block mmap access when read/write access is restricted

From: Dan Williams
Date: Thu Apr 10 2025 - 21:24:38 EST


Back in 2022 Kees noted that he was able to mmap System RAM below 1MB
even with CONFIG_STRICT_DEVMEM=y [1]. That is allowed for x86 legacy
compatibility reasons, for userspace that wants to read BIOS data
resident at that address. However, the expectation is that when
devmem_is_allowed() returns 2 the access is redirected to return
zeroes.

That happens for the read()/write() case, but by code inspection
mmap() has no such restriction.
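
For illustration only (not part of this patch), a minimal userspace
sketch of how the mmap() path can be probed. probe_devmem_mmap() is a
hypothetical helper name; it assumes /dev/mem exists and that the caller
has the privileges needed to open it. It only separates an open()
failure from an mmap() failure and never reads the mapped page:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to map one page of /dev/mem read-only at
 * 'offset'. Returns 0 if the mapping was established, or a negative
 * errno value. *failed_in_mmap is set to 1 only when open() succeeded
 * and mmap() itself refused the request.
 */
static int probe_devmem_mmap(off_t offset, int *failed_in_mmap)
{
	long page = sysconf(_SC_PAGESIZE);
	void *map;
	int fd, err = 0;

	assert(page > 0);
	*failed_in_mmap = 0;

	fd = open("/dev/mem", O_RDONLY);
	if (fd < 0)
		return -errno;	/* no device node, or missing privileges */

	map = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, offset);
	if (map == MAP_FAILED) {
		err = -errno;	/* saved before close() can clobber errno */
		*failed_in_mmap = 1;
	} else {
		munmap(map, (size_t)page);
	}

	close(fd);
	return err;
}
```

Calling probe_devmem_mmap(0, &flag) targets the first page below 1MB.
A return of -EPERM with flag set to 1 means mmap() itself rejected the
range, as opposed to open() failing for lack of privilege.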

Now, the confidential x86 VM (CVM) use case wants to depend on
"devmem_is_allowed() == 2" guaranteeing that no mapping to potentially
encrypted memory is established [2]. The options to enable that are:
teach mmap_mem() to meet the "zeroed buffer" implication of
devmem_is_allowed() returning "2", or return -EPERM for that case.

Return -EPERM in the hope that userspace does not actually depend on the
legacy behavior of being able to reliably map the first 1MB of memory on
x86, i.e. that all legacy cases are using read()/write() to safely read
zeroes. If that turns out not to be true, then either a "map zeroes"
scheme can be added, or the CVM case can return 3 from
devmem_is_allowed() to hide the CVM restriction from legacy
environments.
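
As a rough illustration of the behavioral change (a userspace model,
not kernel code), the sketch below mirrors the loop in
range_is_allowed() with a toy stand-in for devmem_is_allowed(). The
names prefixed model_/MODEL_ and the pfn boundaries are assumptions
made up for the example: pfns below 1MB return 2, a made-up "System
RAM" window above that returns 0, and everything else returns 1:

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SIZE    4096UL
#define MODEL_LOW_1MB_PFNS (0x100000UL / MODEL_PAGE_SIZE)	/* 256 */
#define MODEL_RAM_END_PFN  1024UL				/* toy value */

/* Toy policy standing in for devmem_is_allowed(): 0, 1 or 2. */
static int model_devmem_is_allowed(unsigned long pfn)
{
	if (pfn < MODEL_LOW_1MB_PFNS)
		return 2;	/* restricted: read()/write() see zeroes */
	if (pfn < MODEL_RAM_END_PFN)
		return 0;	/* denied */
	return 1;		/* allowed */
}

/*
 * strict == false models the old check (only 0 blocks the range),
 * strict == true models the new check (anything but 1 blocks it).
 */
static bool model_range_ok(unsigned long pfn, unsigned long size, bool strict)
{
	unsigned long long from = (unsigned long long)pfn * MODEL_PAGE_SIZE;
	unsigned long long to = from + size;
	unsigned long long cursor = from;

	while (cursor < to) {
		int verdict = model_devmem_is_allowed(pfn);

		if (strict ? verdict != 1 : verdict == 0)
			return false;
		cursor += MODEL_PAGE_SIZE;
		pfn++;
	}
	return true;
}
```

Under this toy policy, a one-page range at pfn 0 passes the old check
and fails the new one, while a range that only touches pfns returning 1
passes both.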

Link: http://lore.kernel.org/CAPcyv4iVt=peUAk1qx_EfKn7aGJM=XwRUpJftBhkUgQEti2bJA@xxxxxxxxxxxxxx [1]
Link: http://lore.kernel.org/fd683daa-d953-48ca-8c5d-6f4688ad442c@xxxxxxxxx [2]
Suggested-by: Nikolay Borisov <nik.borisov@xxxxxxxx>
Cc: Kees Cook <keescook@xxxxxxxxxxxx>
Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
---
 include/linux/io.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/io.h b/include/linux/io.h
index 0642c7ee41db..564934f7e70d 100644
--- a/include/linux/io.h
+++ b/include/linux/io.h
@@ -191,7 +191,12 @@ static inline int range_is_allowed(unsigned long pfn, unsigned long size)
 	u64 cursor = from;
 
 	while (cursor < to) {
-		if (!devmem_is_allowed(pfn))
+		/*
+		 * Any restricted access is treated as "no access", i.e.
+		 * handle devmem_is_allowed() returning "2" to indicate
+		 * restricted access.
+		 */
+		if (devmem_is_allowed(pfn) != 1)
 			return 0;
 		cursor += PAGE_SIZE;
 		pfn++;