Re: [RFC PATCH] vfio/pci: map prefetchble bars as writecombine

From: Alex Williamson
Date: Fri Jul 20 2018 - 16:27:35 EST


On Thu, 19 Jul 2018 21:49:48 +0530
Srinath Mannam <srinath.mannam@xxxxxxxxxxxx> wrote:

> HI Alex,
>
> On Thu, Jul 19, 2018 at 8:42 PM, Alex Williamson
> <alex.williamson@xxxxxxxxxx> wrote:
> > On Thu, 19 Jul 2018 20:17:11 +0530
> > Srinath Mannam <srinath.mannam@xxxxxxxxxxxx> wrote:
> >
> >> HI Alex,
> >>
> >> On Thu, Jul 19, 2018 at 2:55 AM, Alex Williamson
> >> <alex.williamson@xxxxxxxxxx> wrote:
> >> > On Thu, 19 Jul 2018 00:05:18 +0530
> >> > Srinath Mannam <srinath.mannam@xxxxxxxxxxxx> wrote:
> >> >
> >> >> Hi Alex,
> >> >>
> >> >> On Tue, Jul 17, 2018 at 8:52 PM, Alex Williamson
> >> >> <alex.williamson@xxxxxxxxxx> wrote:
> >> >> > On Fri, 13 Jul 2018 10:26:17 +0530
> >> >> > Srinath Mannam <srinath.mannam@xxxxxxxxxxxx> wrote:
> >> >> >
> >> >> >> By default all BARs map with VMA access permissions
> >> >> >> as pgprot_noncached.
> >> >> >>
> >> >> >> In ARM64 pgprot_noncached is MT_DEVICE_nGnRnE which
> >> >> >> is strongly ordered and allows aligned access.
> >> >> >> This type of mapping works for NON-PREFETCHABLE bars
> >> >> >> containing EP controller registers.
> >> >> >> But it restricts PREFETCHABLE bars from doing
> >> >> >> unaligned access.
> >> >> >>
> >> >> >> In CMB NVMe drives PREFETCHABLE bars are required to
> >> >> >> map as MT_NORMAL_NC to do unaligned access.
> >> >> >>
> >> >> >> Signed-off-by: Srinath Mannam <srinath.mannam@xxxxxxxxxxxx>
> >> >> >> Reviewed-by: Ray Jui <ray.jui@xxxxxxxxxxxx>
> >> >> >> Reviewed-by: Vikram Prakash <vikram.prakash@xxxxxxxxxxxx>
> >> >> >> ---
> >> >> >
> >> >> > This has been discussed before:
> >> >> >
> >> >> > https://www.spinics.net/lists/kvm/msg156548.html
> >> >> Thank you for inputs.. I have gone through the long list of mail chain
> >> >> discussion.
> >> >> >
> >> >> > CC'ing the usual suspects from the previous thread. I'm not convinced
> >> >> > that the patch here has considered anything other than the ARM64
> >> >> > implications and it's not clear that it considers compatibility with
> >> >> > existing users or devices at all. Can we guarantee for all devices and
> >> >> > use cases that WC is semantically equivalent and preferable to UC? If
> >> >> > not then we need to device an extension to the interface that allows
> >> >> > the user to specify WC. Thanks,
> >> >> >
> >> >> To implement with user specified WC flags, many changes need to be done.
> >> >> Suppose In DPDK, prefetcable BARs map using WC flag, then also same
> >> >> question comes
> >> >> that WC may be different for different CPUs.
> >> >> As per functionality, both WC and PREFETCHABLE are same, like merging writes and
> >> >> typically WC is uncached.
> >> >> So, based on prefetchable BARs behavior and usage we need to map bar memory.
> >> >> Is it right to map prefetchable BARs as strongly ordered, aligned
> >> >> access and uncached?
> >> >
> >> > Is it possible to answer that question generically? Whether to map a
> >> > BAR as UC or WC is generally a question for the driver. Does the
> >> > device handle unaligned accesses? Does the device need strong memory
> >> > ordering? If this is a driver level question then the driver that
> >> > needs to make that decision is the userspace driver. VFIO is just a
> >> > pass-through here and since we don't offer the user a choice of
> >> > mappings, we take the safer and more conservative mapping, ie. UC.
> >> >
> >> Yes, you are right, driver should make the decision based on its requirement.
> >> In my case, user space driver is part of SPDK, so SPDK should request DPDK
> >> and DPDK should request VFIO to map BAR for its choice of mapping.
> >> So to implement this we need code changes in VFIO, DPDK and SPDK.
> >>
> >> > You're suggesting that there are many changes to be done if we modify
> >> > the vfio interface to expose WC under the user's control rather than
> >> > simply transparently impose WC for all mappings, but is that really the
> >> > case? Most devices on most platforms seem to work fine now. Perhaps WC
> >> > is a performance optimization, but this is the first instance I've seen
> >> > of it as a functional issue. Does that suggest that the imposed
> >> > alignment on your platform is perhaps unique and the relaxed alignment
> >> > should be implemented at the architecture specific memory flags for UC
> >> > mappings? For instance, does x86 require this change for the same
> >> > device? The chance for regressions of other devices on other platforms
> >> > seems rather high as proposed. Thanks,
> >> This issue is not specific to platform or device. this is the requirement of
> >> CMB enabled NVMe cards.
> >> NVMe kernel driver already has support to map CMB bar as WC using
> >> ioremap_wc function.
> >> File: drivers/nvme/host/pci.c
> >> Function: nvme_map_cmb
> >> code: dev->cmb = ioremap_wc(pci_resource_start(pdev, bar) + offset, size);
> >> It means ioremap_wc is working with all platforms and WC map of
> >> perfetchable BARs does not harm.
> >> Same is required in SPDK NVMe driver also, without WC map it may work
> >> in x86 platform, but it does not work in ARM platforms.
> >
> > Doesn't this contradict your assertion that it's not specific to
> > platform or device? The device requires support for unaligned
> > accesses. The platform chooses to restrict unaligned accesses for
> > non-WC mappings while other platforms do not. The native driver can
> > still clearly have performance considerations for choosing to use WC
> > mappings, but it's still not clear to me that the functionality issue
> > isn't self inflicted by the platform definition of UC vs WC. Thanks,
> >
> Device allows both aligned and unaligned access.. so software have
> flexibility can do unaligned access.
> As per ARM64 platform, with UC map, memory access should be un-cached,
> aligned access and strongly order mapping.
> with WC map, memory access can be aligned/unaligned and un-cached. I
> thought this is the property of platform not issue.
> To allow software to do unaligned access of device memory, we need to
> use WC map of ARM64 platform case.
> In ARM platforms UC mapping is used to map controller registers which
> are 4 byte aligned exposed by non-prefetchable bars.
> Also prefetchable BARs allows write merging so I thought using WC map
> fulfills both write merging (add performance) and unaligned
> access.
> Can I modify the code as below to enable only for ARM platforms.
> vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> +#if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
> + if (pci_resource_flags(pdev, index) & IORESOURCE_PREFETCH)
> + vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
> +#endif
> vma->vm_pgoff = (pci_resource_start(pdev, index) >> PAGE_SHIFT) + pgoff;

While the risk of regression is smaller by restricting this to ARM, I
don't think it's the right solution. What happens when a device
requires strict ordering? ARM now behaves differently than any other
architecture, that's not acceptable. Thanks,

Alex