Re: [RFC 1/2] vfio/pci: keep the prefetchable attribute of a BAR region in VMA

From: Shanker R Donthineni
Date: Sat May 08 2021 - 12:33:26 EST


Hi Marc,

On 5/5/21 1:02 PM, Catalin Marinas wrote:
>>> Will/Catalin, perhaps you could explain your thought process on why you chose
>>> Normal NC for ioremap_wc on the armv8 linux port instead of Device GRE or other
>>> Device Gxx.
>> I think a combination of: compatibility with 32-bit Arm, the need to
>> support unaligned accesses and the potential for higher performance.
> IIRC the _wc suffix also matches the pgprot_writecombine() used by some
> drivers to map a video framebuffer into user space. Accesses to the
> framebuffer are not guaranteed to be aligned (memset/memcpy don't ensure
> alignment on arm64 and the user doesn't have a memset_io or memcpy_toio).
>
>> Furthermore, ioremap() already gives you a Device memory type, and we're
>> tight on MAIR space.
> We have MT_DEVICE_GRE currently reserved though no in-kernel user, we
> might as well remove it.
@Marc, could you provide your thoughts/guidance on the next step? The
proposal of getting hints for prefetchable regions from VFIO/QEMU is not
recommended, so the only option left is to implement arm64-specific logic
in KVM.

Option-1: I think we could take advantage of the stage-1/stage-2 combining
rules to allow the Normal-NC memory type for device memory in a VM: always
map device memory at stage-2 as Normal-NC and trust the VM's stage-1 MT.

---------------------------------------------------------------
Stage-2 MT     Stage-1 MT    Resultant MT (combining-rules/FWB)
---------------------------------------------------------------
Normal-NC      Normal-WT           Normal-NC
Normal-NC      Normal-WB           Normal-NC
Normal-NC      Normal-NC           Normal-NC
Normal-NC      Device-<attr>       Device-<attr>
---------------------------------------------------------------

We've been using this option internally for testing purposes and have
validated it with NVMe/Mellanox/GPU pass-through devices on the Marvell
ThunderX2 platform.
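
To make the table above concrete, here is a minimal userspace sketch of the
non-FWB combining rule, where the more restrictive of the two memory types
wins. It is only a model, not KVM code: the enum, its ordering and the
function name combine_s1_s2() are made up for illustration, all Device-<attr>
variants are collapsed into a single value, and the FWB encoding is not
modelled.

```c
#include <assert.h>

/*
 * Hypothetical model of memory types, ordered from least to most
 * restrictive. All Device-<attr> variants are collapsed into one value.
 */
enum memtype {
	MEMTYPE_NORMAL_WB,	/* Normal, Write-Back cacheable */
	MEMTYPE_NORMAL_WT,	/* Normal, Write-Through cacheable */
	MEMTYPE_NORMAL_NC,	/* Normal, Non-cacheable */
	MEMTYPE_DEVICE,		/* any Device-<attr> */
	MEMTYPE_COUNT
};

/*
 * Combine a stage-1 and a stage-2 memory type without FWB: the result
 * is the more restrictive of the two inputs.
 */
static enum memtype combine_s1_s2(enum memtype s1, enum memtype s2)
{
	assert(s1 < MEMTYPE_COUNT && s2 < MEMTYPE_COUNT);
	return s1 > s2 ? s1 : s2;
}
```

With stage-2 fixed at MEMTYPE_NORMAL_NC, every Normal stage-1 type combines
to Normal-NC and a Device stage-1 type stays Device, which matches the four
rows of the table.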

Option-2: Get the resource properties associated with the MMIO region using
lookup_resource() and map it at stage-2 as Normal-NC if IORESOURCE_PREFETCH
is set in its flags.
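
The decision in Option-2 could be sketched roughly as below. This is a
userspace model, not a patch: struct mock_resource, mock_lookup_resource(),
s2_use_normal_nc() and the sample BAR addresses are all hypothetical
stand-ins for the kernel's struct resource and lookup_resource(). Only the
IORESOURCE_PREFETCH value is taken from include/linux/ioport.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as the kernel's definition in include/linux/ioport.h. */
#define IORESOURCE_PREFETCH	0x00002000UL

/* Hypothetical stand-in for the kernel's struct resource. */
struct mock_resource {
	uint64_t start;
	uint64_t end;
	unsigned long flags;
};

/* Made-up sample BARs: the first is prefetchable, the second is not. */
static const struct mock_resource sample_bars[] = {
	{ 0x10000000ULL, 0x10ffffffULL, IORESOURCE_PREFETCH },
	{ 0x20000000ULL, 0x2000ffffULL, 0 },
};

/* Stand-in for lookup_resource(): find the entry that begins at 'start'. */
static const struct mock_resource *
mock_lookup_resource(const struct mock_resource *tbl, size_t n, uint64_t start)
{
	assert(tbl != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (tbl[i].start == start)
			return &tbl[i];
	return NULL;
}

/*
 * Return true if the region starting at 'start' should be mapped at
 * stage-2 as Normal-NC, i.e. it is known and marked prefetchable.
 * Unknown regions fall back to the default Device mapping (false).
 */
static bool s2_use_normal_nc(const struct mock_resource *tbl, size_t n,
			     uint64_t start)
{
	const struct mock_resource *res = mock_lookup_resource(tbl, n, start);

	return res != NULL && (res->flags & IORESOURCE_PREFETCH) != 0;
}
```

Regions that are not found are treated as non-prefetchable here, so they
would keep the existing Device mapping at stage-2.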