RE: [PATCH 0/5] PCI/CXL: Save and restore CXL DVSEC and HDM state across resets

From: Dan Williams

Date: Tue Mar 17 2026 - 13:04:36 EST


Manish Honap wrote:
[..]
> > The CXL accelerator series is currently contending with being able to
> > restore device configuration after reset. I expect vfio-cxl to build on
> > that, not push CXL flows into the PCI core.
>
> Hello Dan,
>
> My VFIO CXL Type-2 passthrough series [1] takes a position on this that I
> would like to explain because I expect you will have similar concerns about
> it and I'd rather have this conversation now.
>
> > The Type-2 passthrough series takes the opposite structural approach to the
> > one you are suggesting here: CXL Type-2 support is an optional extension
> > compiled into vfio-pci-core (CONFIG_VFIO_CXL_CORE), not a separate driver.
>
> Here is the reasoning:
>
> 1. Device enumeration
> =====================
>
> CXL Type-2 devices (GPU + accelerator class) are enumerated as struct pci_dev
> objects. The kernel discovers them through PCI config space scan, not through
> the CXL bus. The CXL capability is advertised via the DVSEC (PCI_EXT_CAP_ID
> 0x23, Vendor ID 0x1E98), which is PCI config space. There is no CXL bus
> device to bind to.
>
> A standalone vfio-cxl driver would therefore need to match on the PCI device
> just like vfio-pci does, and then call into vfio-pci-core for every PCI
> concern: config space emulation, BAR region handling, MSI/MSI-X, INTx, DMA
> mapping, FLR, and migration callbacks. That is the variant driver pattern
> we rejected in favour of generic CXL passthrough. We have seen this exact

Lore link for this "rejection" discussion?

> outcome with the prior iterations of this series before we moved to the
> enlightened vfio-pci model.

I still do not understand the argument. CXL functionality is a library
that PCI drivers can use. If vfio-pci functionality is also a library,
then vfio-cxl is a driver that uses services from both libraries. Where
the module and driver name boundaries are drawn is more of an
organizational decision than a functional one.

The argument for vfio-cxl organizational independence is more about
being able to tell, at a diffstat level, the relative PCI vs CXL
maintenance impact / regression risk.

> 2. CXL-CORE involvement
> =======================
>
> > The CXL Type-2 passthrough series does not bypass the CXL core. At
> > vfio_pci_probe() time the CXL enlightenment layer:
>
> - calls cxl_get_hdm_info() to probe the HDM Decoder Capability block,
> - calls cxl_get_committed_decoder() to locate pre-committed firmware regions,
> - calls cxl_create_region() / cxl_request_dpa() for dynamic allocation,
> - creates a struct cxl_memdev via the CXL core (via cxl_probe_component_regs,
> the same path Alejandro's v23 series uses).
>
> The CXL core is fully involved. The difference is that the binding to
> userspace is still through vfio-pci, which already manages the pci_dev
> lifecycle, reset sequencing, and VFIO region/irq API.
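
As described, that probe-time flow amounts to something like the sketch
below. This is pseudocode: the cxl_get_hdm_info(), cxl_get_committed_decoder(),
cxl_request_dpa() / cxl_create_region() signatures and the vfio_cxl_enlighten()
helper are assumptions inferred from the series description, not mainline API.

```c
/* Pseudocode sketch; signatures assumed from the series, not mainline. */
static int vfio_cxl_enlighten(struct vfio_pci_core_device *vdev)
{
	struct pci_dev *pdev = vdev->pdev;
	struct cxl_hdm_info info;
	struct cxl_decoder *cxld;
	int rc;

	/* Probe the HDM Decoder Capability block */
	rc = cxl_get_hdm_info(pdev, &info);
	if (rc)
		return rc;

	/* Prefer a region the firmware already committed... */
	cxld = cxl_get_committed_decoder(pdev);
	if (!cxld) {
		/* ...otherwise allocate DPA and create a region dynamically */
		rc = cxl_request_dpa(pdev, &info);
		if (rc)
			return rc;
		rc = cxl_create_region(pdev, &info);
		if (rc)
			return rc;
	}

	/* Register a memdev so the CXL core owns the topology object */
	return vfio_cxl_create_memdev(pdev);	/* hypothetical helper */
}
```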

Sure, every CXL driver in the system will do the same.

> 3. Standalone vfio-cxl
> ======================
>
> To match the model you are suggesting, vfio-cxl would need to:
>
> (a) Register a new driver on the CXL bus (struct cxl_driver), probing
> struct cxl_memdev or a new struct cxl_endpoint,

What, why? Just like this patch series was proposing to extend the PCI
core with additional common functionality, the proposal is to extend the
CXL core object drivers with the same.

> (b) Re-implement or delegate everything vfio-pci-core provides — config
> space, BAR regions, IRQs, DMA, FLR, and VFIO container management —
> either by calling vfio-pci-core as a library or by duplicating it, and

What is the argument against a library?
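
Note the established variant-driver pattern already consumes vfio-pci-core
as a library. A hypothetical vfio-cxl built the same way would look roughly
like the existing mlx5 / hisi_acc variant drivers; the sketch below uses the
real vfio_alloc_device() / vfio_pci_core_register_device() entry points, but
vfio_cxl_probe() and vfio_cxl_ops are invented names and error handling is
abbreviated.

```c
static int vfio_cxl_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
	struct vfio_pci_core_device *vdev;
	int ret;

	/* Allocate the core device with CXL-aware vfio_device_ops */
	vdev = vfio_alloc_device(vfio_pci_core_device, vdev,
				 &pdev->dev, &vfio_cxl_ops);
	if (IS_ERR(vdev))
		return PTR_ERR(vdev);

	dev_set_drvdata(&pdev->dev, vdev);

	/*
	 * All PCI plumbing (config space emulation, BARs, IRQs, FLR)
	 * stays in the vfio-pci-core library; only the CXL-specific
	 * ops differ.
	 */
	ret = vfio_pci_core_register_device(vdev);
	if (ret) {
		vfio_put_device(&vdev->vdev);
		return ret;
	}
	return 0;
}
```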

> (c) present to userspace through a new device model distinct from
> vfio-pci.

CXL is a distinct operational model. What breaks if userspace is
required to explicitly account for CXL passthrough?

> This is a significant new surface. QEMU's CXL passthrough support already
> builds on vfio-pci: it receives the PCI device via VFIO, reads the
> VFIO_DEVICE_INFO_CAP_CXL capability chain, and exposes the CXL topology.
> A vfio-cxl object model would require non-trivial QEMU changes for something
> that already works in the enlightened vfio-pci model.

What specifically about a kernel code organization choice affects the
QEMU implementation? A uAPI is agnostic to kernel code organization.

The concern is designing ourselves into a PCI corner when, long term,
QEMU benefits from understanding CXL objects. For example, CXL error
handling / recovery is already well on its way to being performed in
terms of CXL port objects.

> 4. Module dependency
> ====================
>
> Current solution: CONFIG_VFIO_CXL_CORE depends on CONFIG_CXL_BUS. We do not
> add CXL knowledge to the PCI core;

drivers/pci/cxl.c

> we add it to the VFIO layer that is already CXL_BUS-dependent.

Yes, the VFIO layer needs CXL enlightenment, and VFIO's requirements
imply wider benefits to other CXL-capable devices.

> I would very much appreciate your thoughts on [1] considering the above. I want
> to understand your thoughts on whether vfio-pci-core can remain the single
> entry point from userspace, or whether you envision a new VFIO device type.
>
> Jonathan has indicated he has thoughts on this as well; hopefully, we
> can converge on a direction that doesn't require duplicating vfio-pci-core.

No one is suggesting, "require duplicating vfio-pci-core"; please do not
argue with strawman caricatures like this.

> [1] https://lore.kernel.org/linux-cxl/20260311203440.752648-1-mhonap@xxxxxxxxxx/

Will take a look...