Re: [PATCH v14 29/44] arm64: RMI: Runtime faulting of memory

From: Gavin Shan

Date: Fri Jun 26 2026 - 03:44:01 EST

On 6/26/26 1:58 AM, Suzuki K Poulose wrote:

On 25/06/2026 14:53, Gavin Shan wrote:

On 6/6/26 12:35 AM, Lorenzo Pieralisi wrote:

On Fri, Jun 05, 2026 at 06:11:11PM +1000, Gavin Shan wrote:

On 6/5/26 5:28 PM, Lorenzo Pieralisi wrote:

On Fri, Jun 05, 2026 at 04:23:15PM +1000, Gavin Shan wrote:

[...]

I tried to rebase Jean's latest QEMU series [1] onto upstream QEMU, and found
that memory slots backed by THP are broken. With THP disabled on the host and
other fixes (mentioned in my previous replies) applied on top of this (v14)
series, I'm able to boot a realm guest with the rebased QEMU series [2], plus more
fixes on top.

[1] https://git.codelinaro.org/linaro/dcap/qemu.git ; (branch: cca/latest)
[2] https://git.qemu.org/git/qemu.git               ; (branch: cca/gavin)

Lorenzo, are you saying that someone is working on ARM/CCA support in QEMU?

Mathieu and I are working on that yes and with Steven/Suzuki to fix the THP
issues you pointed out above.

If so, is there a QEMU repository I can try?

We should be able to submit patches by end of June - we shall let you know
whether we can make something available earlier.

Not sure if there are other known issues in this series. It seems the stage2
page fault handling on the shared space isn't working well. In my test, the
vring (struct vring_desc) of virtio-net-pci is updated by the guest, but the
data isn't seen by QEMU. I suspect the host page frame number may not be properly
resolved in the s2 page fault handler for the shared (unprotected) space.

- I rebased Jean's latest qemu branch onto upstream qemu;

- On the host, which is emulated by qemu/tcg, the THP (transparent huge page) is
   disabled.

- On the guest, I can see the virtio vring (struct vring_desc) is updated. The
   S1 page-table entry looks correct because the corresponding physical address
   0x10046880000 is a sane shared (unprotected) space address.

   [   52.094143] software IO TLB: Memory encryption is active and system is using DMA bounce buffers
   [   52.289746] virtqueue_add_desc_split: desc[0]@0xffff000006880000, [00000100b983f000 00000640 0002 0001]
   [   52.432150] PTE 0x00e8010046880707 at address 0xffff000006880000

- On the host, the s2 page-table entry is unmapped due to the attribute transition
   (private -> shared). A subsequent S2 page fault is raised against the address and
   the s2 page-table entry is built.

   [ 109.259077] ====> realm_unmap_shared_range: tracked_unprot_addr=0x10046880000
   [ 109.260249] realm_unmap_shared_range: unmapped shared range at 0x10046880000
   [ 109.317786] realm_unmap_shared_range: unmapped shared range at 0x10046880000
   [ 109.629939] ====> kvm_handle_guest_abort: fault_ipa=0x10046880000, esr=0x92000007
   [ 109.630245] realm_map_non_secure: ipa=0x10046880000, pfn=0xb8b59, size=0x1000, prot=0xf
   [ 109.630331] realm_map_non_secure: ipa=0x10046880000, ipa_top=0x10046881000, flags=0x1e0001, range_desc=0xb8b59004
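For reference, the esr value in the kvm_handle_guest_abort line above can be split into its fields. This is only a sketch based on the architectural ESR_ELx layout for data aborts (EC in bits [31:26], IL in bit 25, ISV in bit 24, WnR in bit 6, DFSC in bits [5:0]); the helper name decode_dabt_esr is made up for illustration and is not part of the series.

```python
# Sketch: decode an ESR_ELx value reported for a data abort.
# Bit positions follow the architectural ESR_ELx layout; the helper
# name and the returned dict are illustrative only.

EC_DABT_LOW = 0x24  # Data Abort from a lower Exception level

# DFSC[5:2] encodings for the common stage 1/2 MMU fault classes;
# DFSC[1:0] then carries the lookup level.
FAULT_CLASS = {
    0b0000: "address size",
    0b0001: "translation",
    0b0010: "access flag",
    0b0011: "permission",
}


def decode_dabt_esr(esr):
    """Return the main fields of a data abort syndrome as a dict."""
    dfsc = esr & 0x3F
    return {
        "ec": (esr >> 26) & 0x3F,      # exception class
        "il": bool(esr & (1 << 25)),   # 32-bit instruction length
        "isv": bool(esr & (1 << 24)),  # instruction syndrome valid
        "wnr": bool(esr & (1 << 6)),   # write-not-read
        "fault": FAULT_CLASS.get(dfsc >> 2, "other"),
        "level": dfsc & 0x3,
    }


if __name__ == "__main__":
    fields = decode_dabt_esr(0x92000007)
    for name, value in fields.items():
        print(f"{name}: {value}")
```

With esr=0x92000007 this gives EC 0x24, a translation fault at level 3, ISV clear and WnR clear, which matches a fault on a 4K page that was just unmapped.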

Are you able to correlate the order of the transitions and the guest
access with the RMM log? We haven't seen this from our end. We are aware
of permission fault issues with Unprotected IPA when backing the memslot
with MAP_PRIVATE areas. But this looks different.

Lorenzo, have you run into this ?

It's hard to correlate the order since the logs are collected from two separate
consoles. For the write permission, I added code to the host so that write
permission is always granted for all s2 page faults in the shared space. Otherwise,
qemu can be killed by -EFAULT or a similar error.

There are more findings after more experiments: this virtio-net-pci device has 3
queues or vrings (Rx/Tx/Ctrl). The Rx/Tx/Ctrl queues are populated in order, one
after another. In the guest kernel, I intentionally write fixed data
(0x0123456789abcdef) to the first 8 bytes of each queue when it gets populated, and
stop the guest at random points to see if the data is gone. I found that the data
written to the Rx/Tx queues is lost after the Ctrl queue is allocated.

The data written to the Rx/Tx queues is lost if the guest stops at (B). The data
written to the Rx/Tx queues isn't lost if the guest stops at (A). I can see the
pattern (0x0123...cdef) by dumping the physical memory through the 'pmemsave'
command in qemu.

DMA allocation
==============
dma_alloc_coherent
dma_alloc_attrs
dma_direct_alloc
__dma_direct_alloc_pages
dma_set_decrypted // (A) No data lost if being stopped here for the Ctrl queue
memset(ret, 0, size) // (B) Data lost after being stopped after memset() for the Ctrl queue

The memset() on the Ctrl queue should trigger a stage2 page fault. It seems the page
fault causes the shared pages for the Rx/Tx queues to be dropped? I need to add more
debugging code and track it down.
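The check on the 'pmemsave' output can be scripted instead of being done by eye. A minimal sketch, assuming a little-endian guest and a dump file written by 'pmemsave'; the function names, the file path and the base address passed in are placeholders, not something taken from the setup described above.

```python
# Sketch: look for the 8-byte marker in a raw memory dump.
# Assumes the guest stored the value little-endian (aarch64 default).
import struct

MARKER = struct.pack("<Q", 0x0123456789ABCDEF)


def find_marker(dump, base=0):
    """Return guest addresses of 8-byte-aligned marker occurrences.

    dump: raw bytes of the dumped region
    base: guest physical address the dump starts at
    """
    hits = []
    pos = dump.find(MARKER)
    while pos != -1:
        if pos % 8 == 0:
            hits.append(base + pos)
        pos = dump.find(MARKER, pos + 1)
    return hits


def scan_file(path, base=0):
    """Read a dump file (hypothetical path) and report marker addresses."""
    with open(path, "rb") as f:
        return find_marker(f.read(), base)
```

Running scan_file() on dumps taken at stop points (A) and (B) and comparing the two result lists shows directly which queues still hold the marker.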

Suzuki

- On QEMU, the updated vring (struct vring_desc) at GPA 0x46880000 isn't seen. All the
   data at that address is zero.

   ====> virtqueue_split_pop: vdev=<virtio-net>, sz=0x38, queue_index=0x0, vq->vring.num=0x100
   virtqueue_split_pop: last_avail_idx=0x0, head=0x0
   address_space_read_cached_slow: cache@0xffff1c036440, addr=0x0, buf=0xffffeee34880, len=0x10
   address_space_read_cached_slow: cache: ptr=0x0, xlat=0x10046880000, len=0x1000, mrs=<realm-dma-region>, is_write=no
   address_space_read_cached_slow: translated to mr=<mach-virt.ram>, mr_addr=0x6880000, l=0x10
   flatview_read_continue_step: mr=<mach-virt.ram>, host=0xffff23e00000, mr_addr=0x6880000, ram_ptr=0xffff2a680000
   virtqueue_split_pop: desc: 0000000000000000 - 00000000 - 00000000 - 00000000
   qemu-system-aarch64: virtio: zero sized buffers are not allowed
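The addresses in the guest, host and QEMU logs above can be cross-checked with a few lines of arithmetic. This is a sketch under assumptions inferred from the logged values only: the shared (unprotected) alias is taken to be IPA bit 40 (ipa 0x10046880000 vs GPA 0x46880000), guest RAM is taken to start at 0x40000000 (GPA 0x46880000 vs mr_addr 0x6880000), and the PTE is taken to use a 4K granule with the output address in bits [47:12]. The 16-byte descriptor layout is the split-ring struct vring_desc (le64 addr, le32 len, le16 flags, le16 next). All helper names are illustrative.

```python
# Sketch: follow one vring address from the guest PTE to the QEMU host pointer.
# SHARED_BIT and RAM_BASE are assumptions inferred from the logs, not
# values taken from the patch series or from QEMU.
import struct

PTE_ADDR_MASK = 0x0000_FFFF_FFFF_F000  # output address bits [47:12]
SHARED_BIT = 1 << 40                   # assumed top IPA bit (shared alias)
RAM_BASE = 0x4000_0000                 # assumed start of guest RAM


def pte_to_ipa(pte):
    """Extract the output address from a 4K page descriptor."""
    return pte & PTE_ADDR_MASK


def ipa_to_gpa(ipa):
    """Drop the shared alias bit to get the address QEMU works with."""
    return ipa & ~SHARED_BIT


def gpa_to_host(gpa, host_base):
    """Offset into the RAM region, then into QEMU's host mapping."""
    return host_base + (gpa - RAM_BASE)


def unpack_desc(raw):
    """Split 16 raw bytes into (addr, len, flags, next)."""
    return struct.unpack("<QIHH", raw)


if __name__ == "__main__":
    ipa = pte_to_ipa(0x00E8010046880707)
    gpa = ipa_to_gpa(ipa)
    print(hex(ipa), hex(gpa), hex(gpa_to_host(gpa, 0xFFFF23E00000)))
    # What the guest logged vs. what QEMU read back (all zeros):
    guest = struct.pack("<QIHH", 0x00000100B983F000, 0x640, 0x2, 0x1)
    print(unpack_desc(guest), unpack_desc(bytes(16)))
```

Under these assumptions the PTE, the faulting IPA, xlat, mr_addr and ram_ptr all refer to the same page, so QEMU appears to be reading the right location and the zero len field in the descriptor it reads back is what trips the "zero sized buffers" check.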

Thanks,
Gavin