Re: [RESEND v13 17/25] cxl: Introduce cxl_pci_drv_bound() to check for bound driver

From: Gregory Price

Date: Wed Nov 05 2025 - 14:03:36 EST


On Wed, Nov 05, 2025 at 12:51:04PM -0500, Gregory Price wrote:
> On Tue, Nov 04, 2025 at 11:02:57AM -0600, Terry Bowman wrote:
> > CXL devices handle protocol errors via driver-specific callbacks rather
> > than the generic pci_driver::err_handlers by default. The callbacks are
> > implemented in the cxl_pci driver and are not part of struct pci_driver, so
> > cxl_core must verify that a device is actually bound to the cxl_pci
> > module's driver before invoking the callbacks (the device could be bound
> > to another driver, e.g. VFIO).
> >
> > However, cxl_core can not reference symbols in the cxl_pci module because
> > it creates a circular dependency. This prevents cxl_core from checking the
> > EP's bound driver and calling the callbacks.
> >
> > To fix this, move drivers/cxl/pci.c into drivers/cxl/core/pci_drv.c and
> > build it as part of the cxl_core module. Compile into cxl_core using
> > CXL_PCI and CXL_CORE Kconfig dependencies. This removes the standalone
> > cxl_pci module, consolidates the cxl_pci driver code into cxl_core, and
> > eliminates the circular dependency so cxl_core can safely perform
> > bound-driver checks and invoke the CXL PCI callbacks.
> >
> > Introduce cxl_pci_drv_bound() to return boolean depending on if the PCI EP
> > parameter is bound to a CXL driver instance. This will be used in future
> > patch when dequeuing work from the kfifo.
> >
> > Signed-off-by: Terry Bowman <terry.bowman@xxxxxxx>
> > Reviewed-by: Dave Jiang <dave.jiang@xxxxxxxxx>
> > Reviewed-by: Ben Cheatham <benjamin.cheatham@xxxxxxx>
> > Reviewed-by: Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>
> >
> > ---
>
> This commit causes my QEMU basic expander setup and a real device setup
> to fail to probe the cxl_core driver.
>
> [ 2.697094] cxl_core 0000:0d:00.0: BAR 0 [mem 0xfe800000-0xfe80ffff 64bit]: not claimed; can't enable device
> [ 2.697098] cxl_core 0000:0d:00.0: probe with driver cxl_core failed with error -22
>
> Probe order issue when CXL drivers are built-in maybe?
>

I've narrowed it down to:

Works
-----
CONFIG_CXL_BUS=m
CONFIG_CXL_MEM=m

Fails
-----
CONFIG_CXL_BUS=y
CONFIG_CXL_MEM=y
(or with exactly one of BUS / MEM built-in)

This commit moves pci.c into cxl_core as pci_drv.o, which puts the PCI
driver ahead of cxl_mem in the built-in order. Note the comment in the
Makefile:

# Order is important here for the built-in case:
# - 'core' first for fundamental init
# - 'port' before platform root drivers like 'acpi' so that CXL-root ports
#   are immediately enabled
# - 'mem' and 'pmem' before endpoint drivers so that memdevs are
#   immediately enabled
# - 'pci' last, also mirrors the hardware enumeration hierarchy
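
To make the ordering concern concrete, here is a toy model (not kernel code)
of how built-in init order could follow link order. It assumes every
component registers at the same initcall level, and the names and list
positions are only illustrative, taken from the Makefile comment rather than
from the actual object lists. It is a sketch of the suspicion above, not a
confirmed root cause.

```python
# Toy model: built-in initcalls at the same level run in link order.
# All names below are illustrative labels, not real kernel symbols.

def init_sequence(link_order, level):
    """Order components by initcall level; ties keep link order (stable sort)."""
    return sorted(link_order, key=lambda obj: level[obj])

def pci_after_mem(seq):
    """True if the PCI endpoint driver initializes after 'mem'."""
    pci = next(i for i, name in enumerate(seq) if name.startswith("pci"))
    return pci > seq.index("mem")

# Before the patch: 'pci' is its own object, linked last.
old_order = ["core", "port", "acpi", "mem", "pmem", "pci"]
# After the patch (assumed): pci_drv is part of 'core', so it initializes early.
new_order = ["core", "pci_drv", "port", "acpi", "mem", "pmem"]

# Assumption: everything sits at the same initcall level.
level = {name: 0 for name in old_order + new_order}

print(pci_after_mem(init_sequence(old_order, level)))  # True
print(pci_after_mem(init_sequence(new_order, level)))  # False
```

With =m the load order comes from module dependencies instead of link
order, which would fit the works/fails split above.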

~Gregory