Re: [PATCH 0/5] PCI/CXL: Save and restore CXL DVSEC and HDM state across resets
From: Dan Williams
Date: Thu Apr 02 2026 - 17:56:11 EST
Alex Williamson wrote:
[..]
> > Now a follow on concern is the plan to manage a case of "PCI operation
> > is available, but CXL operation is not. Does the driver proceed?" Put
> > another way, I immediately see how to convey the policy of "continue
> > without CXL" when there is an explicit driver distinction, but it is
> > ambiguous with an enlightened vfio-pci driver.
>
> As an enlightenment to vfio-pci, CXL support must in all cases degrade
> to PCI support. Manish's series proposes a new flag bit in the
> DEVICE_INFO ioctl for CXL (type2 specifically) that would be used in
> combination with the existing PCI flag. If both are set, it's a PCI
> device with CXL.{mem,cache} capability, otherwise only PCI would be set.
Ok.
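To check my reading of that, here is a minimal userspace sketch of the "CXL always degrades to PCI" rule. The flag names and bit values below are illustrative assumptions, not the values from Manish's series or the vfio uAPI header, and ex_pick_mode() is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; the CXL bit is hypothetical. */
#define EX_DEVICE_FLAGS_PCI (1u << 1)
#define EX_DEVICE_FLAGS_CXL (1u << 9)

enum ex_mode { EX_MODE_UNSUPPORTED, EX_MODE_PCI, EX_MODE_PCI_CXL };

/*
 * Decide how userspace treats a device given the flags reported by a
 * DEVICE_INFO-style query. CXL is only ever an addition on top of PCI,
 * and userspace may decline it, so every CXL case has a PCI fallback.
 */
static enum ex_mode ex_pick_mode(uint32_t flags, bool want_cxl)
{
	if (!(flags & EX_DEVICE_FLAGS_PCI))
		return EX_MODE_UNSUPPORTED;	/* CXL flag alone is not valid */
	if ((flags & EX_DEVICE_FLAGS_CXL) && want_cxl)
		return EX_MODE_PCI_CXL;
	return EX_MODE_PCI;			/* degrade to plain PCI */
}
```

The point of the sketch is that the decision sits in one place in userspace and needs no driver or module selection.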
>
> > > > If vfio-pci functionality is also a library
> > > > then vfio-cxl is a driver that uses services from both libraries. Where
> > > > the module and driver name boundaries are drawn is more an organization
> > > > decision not a functional one.
> > >
> > > But as above, it is functional. Someone needs to define when to use
> > > which driver, which leads to libvirt needing to specify whether a
> > > device is being exposed as PCI or CXL, and the same understanding in
> > > each VMM. OTOH, using vfio-pci as the basis and layering CXL feature
> > > detection, ie. enlightenment, gives us a more compatible, incremental
> > > approach.
> >
> > Ok, to make sure I understand the proposal: userspace still needs to
> > end up with knowledge of CXL operation, but that need not be resolved by
> > module policy.
>
> It's a single module as far as userspace is concerned, and the decision
> lies with userspace whether to take advantage of the CXL features
> indicated by the device flag.
>
> > Userspace also just needs to be ok with the unsightliness of the CXL
> > modules autoloading on systems without CXL.
>
> I'm open to suggestions here. The current proposal will pull in CXL
> modules regardless of having a CXL device.
>
> We could build vfio_cxl_core as a module with an automatic
> MODULE_SOFTDEP in vfio_pci_core. We could then do a symbol_get around
> CXL code so that we never CXL enlighten a device if the module isn't
> loaded, allowing userspace policy control via modprobe.d blacklists.
> We could also use a registration mechanism from vfio-cxl-core to
> vfio-pci-core to avoid symbol_gets.
Probably just wait until someone really needs that small bit of memory
back.
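For reference, the registration alternative could look roughly like the userspace model below. This is a sketch under assumptions, not code from the series: every ex_* name is hypothetical, and a real version would need locking and module reference counting that this omits.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ops table that a vfio-cxl-core-like module would provide. */
struct ex_cxl_ops {
	bool (*enlighten)(int dev_id);
};

/* Owned by the vfio-pci-core-like side; NULL means "CXL module not loaded". */
static const struct ex_cxl_ops *ex_registered_ops;

static int ex_register_cxl_ops(const struct ex_cxl_ops *ops)
{
	if (!ops || ex_registered_ops)
		return -1;	/* invalid, or a provider is already registered */
	ex_registered_ops = ops;
	return 0;
}

static void ex_unregister_cxl_ops(void)
{
	ex_registered_ops = NULL;
}

/* Returns true only if a provider is registered and accepts the device. */
static bool ex_pci_core_open(int dev_id)
{
	if (!ex_registered_ops)
		return false;	/* plain PCI operation, no symbol lookup needed */
	return ex_registered_ops->enlighten(dev_id);
}

/* Stand-in for the CXL side: accepts any non-negative device id. */
static bool ex_cxl_enlighten(int dev_id)
{
	return dev_id >= 0;
}

static const struct ex_cxl_ops ex_cxl_core_ops = {
	.enlighten = ex_cxl_enlighten,
};
```

With this shape, blocking the CXL module via modprobe.d simply means the registration never happens and every device opens as plain PCI.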
> > Implement features like CXL Reset as operations against CXL objects like
> > memdevs and regions. For example, PCI reset does not consider management
> > of cache coherent memory, and certainly not interleaved cache coherent
> > memory. Other CXL drivers also benefit if these capabilities are
> > centralized.
>
> I think "CXL Reset as operations against CXL objects" is largely already
> proposed as [1]. However, it's specifically for type2 devices, so we
> can ignore some of the complications, such as interleaved cache
> coherence, of a type3 use case.
Nothing type3 specific about interleaved cache coherence. Now,
interleaving for accelerators is not in near term scope, but CXL.mem is
coherent. I do not want to paint the design into a corner in case
host-bridge interleaving for bandwidth becomes a consideration.
> [1] https://lore.kernel.org/all/20260306092322.148765-1-smadhavan@xxxxxxxxxx/
Still playing catch up on review, but yes, that version looks
directionally ok and at least has a chance to be extended for
interleaving.
[..]
> > > This is largely a consequence of CXL_BUS being a loadable module.
> >
> > Yes, the question is why does that matter for CXL enlightened operation?
> > Simply do not burden the PCI core to learn all the CXL concerns.
>
> How do we then proceed relative to save/restore of CXL state based on a
> PCI reset? Should CXL core register a save/restore handler with PCI
> core or does PCI core reach out for a symbol from CXL core to support
> save/restore?
I am currently thinking of a CXL-registered handler to enable enlightened
reset, or a direct CXL uAPI.
Is it not safe to assume that new CXL awareness in user tooling is
prepared to move to new CXL-aware reset interfaces?
> If CXL core is not loaded, are we ok with silently losing CXL state
> across a PCI reset, ie. assume that state is unused currently and accept
> the risk of losing preconfigured decoders?
> Does PCI core need to be involved in suppressing SBR?
SBR is disabled by default. It can currently be destructively forced
with the PCI "cxl_bus" reset type.
Maybe what this wants is a new "SBR iff single device CXL.mem and mem
not kernel mapped", to set aside the "but coherent interleave" noise.
Just not sure it is worth introducing the concept of "device rejectable
PCI reset" vs requiring direct use of the CXL uAPI.
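For what it is worth, the "SBR iff" condition reduces to a small predicate. The sketch below only models it; struct ex_cxl_mem_state and ex_sbr_allowed() are hypothetical names, and how the kernel would actually learn the interleave and mapping state is left open.

```c
#include <stdbool.h>

/* Hypothetical summary of the CXL.mem state behind a device. */
struct ex_cxl_mem_state {
	unsigned int interleave_ways;	/* 1 means single device, no interleave */
	bool kernel_mapped;		/* memory currently mapped by the kernel */
};

/*
 * Allow a secondary bus reset only for single-device CXL.mem that the
 * kernel has not mapped; everything else is rejected.
 */
static bool ex_sbr_allowed(const struct ex_cxl_mem_state *s)
{
	return s->interleave_ways == 1 && !s->kernel_mapped;
}
```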