Re: [RFC 1/2] vfio/pci: keep the prefetchable attribute of a BAR region in VMA

From: Alex Williamson
Date: Mon May 03 2021 - 10:44:41 EST


On Mon, 3 May 2021 13:59:43 +0000
Vikram Sethi <vsethi@xxxxxxxxxx> wrote:

> > From: Mark Kettenis <mark.kettenis@xxxxxxxxx>
> > > From: Marc Zyngier <maz@xxxxxxxxxx>
>
> snip
> > > If, by enumerating the properties of Prefetchable, you can show that
> > > they are a strict superset of Normal_NC, I'm on board. I haven't seen
> > > such an enumeration so far.
> > >
> snip
> > > Right, so we have made a small step in the direction of mapping
> > > "prefetchable" onto "Normal_NC", thanks for that. What about all the
> > > other properties (unaligned accesses, ordering, gathering)?
> >
> Regarding gathering/write combining, that is also allowed for prefetchable per the PCI spec

As others have stated, gather/write combining itself is not well
defined.

> From 1.3.2.2 of the 5.0 base spec:
> A PCI Express Endpoint requesting memory resources through a BAR must set the BAR's Prefetchable bit unless
> the range contains locations with read side-effects or locations in which the Function does not tolerate write
> merging.

"Write merging" is a very specific thing, per PCI 3.0, section 3.2.6:

Byte Merging – occurs when a sequence of individual memory writes
(bytes or words) are merged into a single DWORD.

The semantics suggest quadword support in addition to dword, but don't
require it. Writes to bytes within a dword can be merged, but
duplicate writes cannot.
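
To make that distinction concrete, here is a minimal sketch of byte merging as described above, modeled in plain C. The struct and function names (byte_write, dword_write, merge_bytes) are hypothetical and exist only for illustration; they are not taken from the spec or from any kernel code. The model assumes little-endian byte lanes and a dword-only merge window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one byte-sized posted write. */
struct byte_write {
	uint32_t addr;
	uint8_t data;
};

/* Hypothetical model of the merged result: one dword plus byte enables. */
struct dword_write {
	uint32_t addr;        /* dword-aligned address */
	uint32_t data;        /* merged payload, little-endian lanes */
	uint8_t byte_enables; /* bit N set => byte lane N is written */
};

/*
 * Try to merge n byte writes into a single dword write.
 * Returns false when the writes do not all fall in the same dword, or
 * when the same byte is written twice (that would be collapsing, which
 * is not what merging permits).
 */
static bool merge_bytes(const struct byte_write *w, size_t n,
			struct dword_write *out)
{
	assert(w != NULL && out != NULL);
	if (n == 0)
		return false;

	uint32_t base = w[0].addr & ~UINT32_C(3);
	uint32_t data = 0;
	uint8_t be = 0;

	for (size_t i = 0; i < n; i++) {
		unsigned int lane = w[i].addr & 3u;

		if ((w[i].addr & ~UINT32_C(3)) != base)
			return false;	/* crosses a dword boundary */
		if (be & (1u << lane))
			return false;	/* duplicate write to one byte */

		data |= (uint32_t)w[i].data << (8 * lane);
		be |= (uint8_t)(1u << lane);
	}

	out->addr = base;
	out->data = data;
	out->byte_enables = be;
	return true;
}
```

In this model, writes to bytes 0, 2 and 1 of one dword merge in any order, while two writes to the same byte are rejected, which is the behavior full write combining does not promise to preserve.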

It seems like an extremely liberal application to suggest that this one
write semantic encompasses full write combining semantics, which itself
is not clearly defined.

> Further 7.5.1.2.1 says "A Function is permitted
> to mark a range as prefetchable if there are no side effects on reads, the Function returns all bytes on reads regardless of
> the byte enables, and host bridges can merge processor writes into this range without causing errors"
>
> The "regardless of byte enables" suggests to me that unaligned is OK, as only
> certain byte enables may be set, what do you think?
>
> So to me prefetchable in PCIe spec allows for write combining, read without

Ironically here, the above PCI spec section defining write merging has
separate sections for "combining", "merging", and "collapsing". Only
merging is indicated as a requirement for prefetchable resources.

> side effects (prefetch/speculative as long as uncached), and unaligned. Regarding
> ordering I didn't find a statement one way or other in PCIe prefetchable definition, but
> I think that goes beyond what PCIe says or doesn't say anyway since reordering can
> also happen in the CPU, and since driver must be aware of correctness issues in its
> producer/consumer models it will need the right barriers where they are required
> for correctness anyway (required for the driver/userspace to work on host w/ ioremap_wc).

A lot of hand waving here, imo. Thanks,

Alex