Re: [RFC PATCH] x86/sev: Disallow userspace access to BIOS region for SEV-SNP guests

From: Nikolay Borisov
Date: Thu Apr 10 2025 - 08:04:25 EST




On 9.04.25 г. 21:39 ч., Dan Williams wrote:
Kees Cook wrote:
On Tue, Apr 08, 2025 at 04:55:08PM -0700, Dan Williams wrote:
Dave Hansen wrote:
On 4/8/25 06:43, Tom Lendacky wrote:
Tom/Boris, do you see a problem blocking access to /dev/mem for SEV
guests?
Not sure why we would suddenly not allow that.

Both TDX and SEV-SNP have issues with allowing access to /dev/mem.
Disallowing access to the individually troublesome regions can fix
_part_ of the problem. But suddenly blocking access is guaranteed to fix
*ALL* the problems forever.

...or at least solicits practical use cases for why the kernel needs to
poke holes in the policy.

Or, maybe we just start returning 0's for all reads and throw away all
writes. That is probably less likely to break userspace that doesn't
know what it's doing in the first place.

Yes, and a bulk of the regression risk has already been pipe-cleaned by
KERNEL_LOCKDOWN that shuts down /dev/mem and PCI resource file mmap in
many scenarios.

Here is an updated patch that includes some consideration for mapping
zeros for known legacy compatibility use cases.
[..]
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index bfa444a7dbb0..c8679ae1bc8b 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -867,6 +867,8 @@ void __init poking_init(void)
*/
int devmem_is_allowed(unsigned long pagenr)
{
+ bool platform_allowed = x86_platform.devmem_is_allowed(pagenr);
+
if (region_intersects(PFN_PHYS(pagenr), PAGE_SIZE,
IORESOURCE_SYSTEM_RAM, IORES_DESC_NONE)
!= REGION_DISJOINT) {
@@ -885,14 +887,20 @@ int devmem_is_allowed(unsigned long pagenr)
* restricted resource under CONFIG_STRICT_DEVMEM.
*/
if (iomem_is_exclusive(pagenr << PAGE_SHIFT)) {
- /* Low 1MB bypasses iomem restrictions. */
- if (pagenr < 256)
+ /*
+ * Low 1MB bypasses iomem restrictions unless the
+ * platform says "no", in which case map zeroes
+ */
+ if (pagenr < 256) {
+ if (!platform_allowed)
+ return 2;
return 1;
+ }
return 0;
}
- return 1;
+ return platform_allowed;
}
void free_init_pages(const char *what, unsigned long begin, unsigned long end)

I am reminded of this discussion:
https://lore.kernel.org/all/CAPcyv4iVt=peUAk1qx_EfKn7aGJM=XwRUpJftBhkUgQEti2bJA@xxxxxxxxxxxxxx/

As in, mmap will bypass this restriction, so if you really want the low
1MiB to be unreadable, a solution for mmap is still needed...

Glad you remembered that!

This needs a self-test to verify the assumptions here. I can circle back
next week or so take a look at turning this into a bigger series. If
someone has cycles to take this on before that I would not say no to
some help.


Can't we simply treat return value of 2 for range_is_allowed the same way as if 0 was returned in mmap_mem and simply fail the call with -EPERM?