Re: [PATCH v7 34/45] kvm: rme: Hide KVM_CAP_READONLY_MEM for realm guests

From: Gavin Shan
Date: Tue Apr 08 2025 - 02:38:21 EST


On 4/8/25 2:34 AM, Steven Price wrote:
On 04/03/2025 11:51, Gavin Shan wrote:
On 2/14/25 2:14 AM, Steven Price wrote:
For protected memory, read-only isn't supported. While it may be possible
to support read-only for unprotected memory, this isn't supported at
present.

Signed-off-by: Steven Price <steven.price@xxxxxxx>
---
  arch/arm64/kvm/arm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


It would be worth explaining why KVM_CAP_READONLY_MEM isn't supported and
what its negative impact is. It's something to be done in the future, if I'm
correct.

I'll add to the commit message:

Note that this does mean that e.g. ROM (or flash) data cannot be
emulated correctly by the VMM.


Please also mention this if you agree: at present, the RMM exposes no API
for specifying the permissions of a stage-2 page-table entry. Read-only
regions for ROM and flash therefore have to be backed by read-write stage-2
page-table entries, and we have to rely on the stage-1 page table to apply
the proper permissions to those read-only regions.

From QEMU's perspective, all ROM data, which QEMU populates, can be
written. This conflicts with the natural constraint that all ROM data should
be read-only.

Yes this is my understanding of the main impact. I'm not sure how useful
(shared) ROM/flash emulation is. It can certainly be added in the future
if needed. Protected read-only memory I don't believe is useful - the
only sane response I can see from a write fault in that case is killing
the guest.


Yes, the VMM is still able to write to those regions even though they're
read-only, since they're emulated. For a misbehaving guest that also maps
those regions as read-write, the data in those regions can be corrupted by
the guest. That's not the expected outcome.

Since the RMM doesn't expose an API for specifying page-table entry
permissions, all entries have read-write permissions, so we have to give
read-write permission to those read-only regions for now. In the long run,
it's something to be fixed, starting with the RMM.

Thanks,
Gavin

Thanks,
Steve

QEMU
====
rom_add_blob
  rom_set_mr
    memory_region_set_readonly
      memory_region_transaction_commit
        kvm_region_commit
          kvm_set_phys_mem
            kvm_mem_flags                   // flag KVM_MEM_READONLY is missing
            kvm_set_user_memory_region
              kvm_vm_ioctl(KVM_SET_USER_MEMORY_REGION2)
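
To make the QEMU call chain above concrete, here is a minimal, self-contained
sketch of the decision taken at the kvm_mem_flags step. It is not QEMU's code:
sketch_mem_flags() and the SKETCH_* constants are hypothetical names, and the
bit values are only assumed to mirror KVM_MEM_LOG_DIRTY_PAGES and
KVM_MEM_READONLY. The point is that once KVM_CAP_READONLY_MEM is not reported,
a ROM region is registered without the read-only flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; values assumed to mirror the KVM memslot flag bits. */
#define SKETCH_MEM_LOG_DIRTY_PAGES (1u << 0)
#define SKETCH_MEM_READONLY        (1u << 1)

/*
 * Hypothetical helper: compute the memslot flags for one region.
 * cap_readonly_mem stands for the result of probing KVM_CAP_READONLY_MEM.
 */
static uint32_t sketch_mem_flags(bool region_readonly, bool dirty_logging,
                                 bool cap_readonly_mem)
{
    uint32_t flags = 0;

    if (dirty_logging)
        flags |= SKETCH_MEM_LOG_DIRTY_PAGES;
    /* Only request a read-only slot when the capability is advertised. */
    if (region_readonly && cap_readonly_mem)
        flags |= SKETCH_MEM_READONLY;
    return flags;
}

/* Usage: the same ROM region with and without the capability. */
static void sketch_mem_flags_usage(void)
{
    /* Normal guest: capability reported, ROM slot is read-only. */
    assert(sketch_mem_flags(true, false, true) == SKETCH_MEM_READONLY);
    /* Realm guest: capability hidden, ROM slot ends up read-write. */
    assert(sketch_mem_flags(true, false, false) == 0);
}
```

With the capability hidden, the ROM region is handed to KVM as an ordinary
read-write slot, which is the "flag is missing" case marked in the trace.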

non-secure host
===============
rec_exit_sync_dabt
  kvm_handle_guest_abort
    user_mem_abort
      __kvm_faultin_pfn                       // writable == true
        realm_map_ipa
          WARN_ON(!(prot & KVM_PGTABLE_PROT_W))
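
The last step of the fault path above can be modelled as a simple permission
check. This is a sketch under stated assumptions, not the code from the
series: sketch_realm_map_check() and the SKETCH_PROT_* bits are hypothetical,
the bit positions are arbitrary, and the -EFAULT return value is an assumption
(the trace only shows that a missing write permission triggers WARN_ON).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical permission bits; positions are arbitrary for this sketch. */
#define SKETCH_PROT_W (1u << 1)
#define SKETCH_PROT_R (1u << 2)

/*
 * Hypothetical model of the check in realm_map_ipa: a realm mapping is
 * expected to be writable, so a request without W is rejected.
 */
static int sketch_realm_map_check(uint32_t prot)
{
    if (!(prot & SKETCH_PROT_W))
        return -EFAULT; /* assumed error code, not taken from the patch */
    return 0;
}

/* Usage: a read-only request fails, a read-write request passes. */
static void sketch_realm_map_usage(void)
{
    assert(sketch_realm_map_check(SKETCH_PROT_R) == -EFAULT);
    assert(sketch_realm_map_check(SKETCH_PROT_R | SKETCH_PROT_W) == 0);
}
```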

non-secure host
===============
kvm_realm_enable_cap(KVM_CAP_ARM_RME_POPULATE_REALM)
  kvm_populate_realm
    __kvm_faultin_pfn                      // writable == true
      realm_create_protected_data_page

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 1f3674e95f03..0f1d65f87e2b 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -348,7 +348,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
      case KVM_CAP_ONE_REG:
      case KVM_CAP_ARM_PSCI:
      case KVM_CAP_ARM_PSCI_0_2:
-    case KVM_CAP_READONLY_MEM:
      case KVM_CAP_MP_STATE:
      case KVM_CAP_IMMEDIATE_EXIT:
      case KVM_CAP_VCPU_EVENTS:
@@ -362,6 +361,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
      case KVM_CAP_COUNTER_OFFSET:
          r = 1;
          break;
+    case KVM_CAP_READONLY_MEM:
      case KVM_CAP_SET_GUEST_DEBUG:
          r = !kvm_is_realm(kvm);
          break;
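
The effect of moving the case label can be shown with a small standalone
model of the switch. Everything here is hypothetical (enum sketch_cap,
struct sketch_vm, sketch_check_extension()); only the shape of the switch
follows the diff above, with is_realm standing in for kvm_is_realm(kvm).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a few KVM capability numbers. */
enum sketch_cap {
    SKETCH_CAP_ONE_REG,
    SKETCH_CAP_READONLY_MEM,
    SKETCH_CAP_SET_GUEST_DEBUG,
    SKETCH_CAP_UNKNOWN,
};

/* Hypothetical VM descriptor; is_realm models kvm_is_realm(kvm). */
struct sketch_vm {
    bool is_realm;
};

static int sketch_check_extension(const struct sketch_vm *vm,
                                  enum sketch_cap ext)
{
    int r;

    switch (ext) {
    case SKETCH_CAP_ONE_REG:
        r = 1;                  /* reported for every VM type */
        break;
    case SKETCH_CAP_READONLY_MEM: /* moved into this group by the patch */
    case SKETCH_CAP_SET_GUEST_DEBUG:
        r = !vm->is_realm;      /* hidden for realm guests */
        break;
    default:
        r = 0;
        break;
    }
    return r;
}

/* Usage: the capability is visible to a normal VM, hidden for a realm. */
static void sketch_check_extension_usage(void)
{
    struct sketch_vm normal = { .is_realm = false };
    struct sketch_vm realm = { .is_realm = true };

    assert(sketch_check_extension(&normal, SKETCH_CAP_READONLY_MEM) == 1);
    assert(sketch_check_extension(&realm, SKETCH_CAP_READONLY_MEM) == 0);
}
```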

Thanks,
Gavin