Re: [RFC PATCH] /dev/mem: Disable /dev/mem under TDX guest

From: Nikolay Borisov
Date: Tue Mar 18 2025 - 08:53:50 EST




On 18.03.25 г. 14:23 ч., Kirill A. Shutemov wrote:
On Tue, Mar 18, 2025 at 01:36:04PM +0200, Nikolay Borisov wrote:
If a piece of memory is read from /dev/mem that falls outside of the
System RAM region, i.e. the BIOS data region, the kernel creates a shared
mapping via xlate_dev_mem_ptr() (this behavior was introduced by
9aa6ea69852c ("x86/tdx: Make pages shared in ioremap()")). This results
in a region having both a shared and a private mapping.

Subsequent accesses to this region via the private mapping induce a
SEPT violation and a crash of the VMM.

Crash of the VMM or TD termination? If the VMM crashes in this case, it has
to be fixed.

I went back through the bug reports, and it seems this causes a SEPT violation inside the guest, which then crashes and is re-created by GCP. So this is a SEPT violation rather than a VMM crash; my bad for mixing up the symptoms.


In this particular case the
scenario was a userspace process reading something from the BIOS data
area at address 0x497, which creates a shared mapping, and a follow-up
reboot accessing __va(0x472), which accesses pfn 0 via the private mapping,
causing mayhem.
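
To make the aliasing concrete: both addresses sit on the same page, so the shared mapping created for 0x497 and the private access through __va(0x472) refer to the same pfn. A minimal userspace sketch, assuming 4 KiB pages (PAGE_SHIFT of 12, as on x86); phys_to_pfn() is a hypothetical helper that only mirrors the shift the kernel's PHYS_PFN() performs:

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86. */
#define PAGE_SHIFT 12

/*
 * Hypothetical helper for illustration: convert a physical address to
 * its page frame number by dropping the in-page offset bits.
 */
static uint64_t phys_to_pfn(uint64_t phys)
{
	return phys >> PAGE_SHIFT;
}
```

With these assumptions phys_to_pfn(0x497) and phys_to_pfn(0x472) both evaluate to 0, i.e. pfn 0 ends up with a shared and a private mapping at the same time.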

I think it should lead to an unrecoverable EPT violation, but not a VMM crash.

<nod> You are correct.


Fix this by simply forbidding access to /dev/mem when running as a TDX
guest.
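
As a rough illustration of the idea (a userspace model only, not the patch itself), the open path for /dev/mem can be thought of as a chain of checks with one more unconditional refusal added for TDX guests. The struct, the function name and the -EPERM return value for the TDX case are assumptions made for this sketch; the first two fields merely stand in for the capable(CAP_SYS_RAWIO) and LOCKDOWN_DEV_MEM checks that the /dev/mem open path already performs:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the inputs to the /dev/mem open decision. */
struct open_ctx {
	bool has_cap_sys_rawio; /* stands in for capable(CAP_SYS_RAWIO) */
	bool locked_down;       /* stands in for a LOCKDOWN_DEV_MEM denial */
	bool tdx_guest;         /* stands in for "running as a TDX guest" */
};

/* Returns 0 if the open would be allowed, a negative errno otherwise. */
static int model_open_dev_mem(const struct open_ctx *ctx)
{
	if (!ctx->has_cap_sys_rawio)
		return -EPERM;

	if (ctx->locked_down)
		return -EPERM;

	/*
	 * Proposed behaviour: refuse unconditionally in a TDX guest.
	 * The errno chosen here is an assumption of this sketch.
	 */
	if (ctx->tdx_guest)
		return -EPERM;

	return 0;
}
```

In this model a privileged process in a TDX guest is refused even when lockdown is not active, which is the property the fix is after.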

I think we need to think more broadly. What about applying a subset of
LOCKDOWN_* in all coco guests by default? Many of them are relevant for
guest security.

How do you envision this working: by introducing another CONFIG_LOCK_DOWN_KERNEL_FORCE_COCO or some such? Will it be opt-in or mandatory?

Should we decide to follow the lockdown route, the owner of the coco guest will have the ability to disable it, and a misbehaving userspace process will still be able to induce an EPT violation.