Re: [PATCH v2 0/4] cxl: Consolidate cxlmd->endpoint accessing

From: Dave Jiang

Date: Mon Mar 16 2026 - 13:57:35 EST




On 3/14/26 12:06 AM, Li Ming wrote:
> Currently, the CXL subsystem has some functions that may access a CXL
> memdev's endpoint before endpoint initialization has completed, or
> without checking that the CXL memdev endpoint is valid.
> This patchset fixes three such scenarios.
>
> 1. cxl_dpa_to_region() may access an invalid CXL memdev endpoint.
> There are two scenarios that can trigger this issue:
> a. memdev poison injection/clearing debugfs interfaces:
> devm_cxl_add_endpoint() registers the CXL memdev endpoint and
> updates cxlmd->endpoint from -ENXIO to the endpoint structure.
> The memdev poison injection/clearing debugfs interfaces are registered
> before devm_cxl_add_endpoint() is invoked in cxl_mem_probe().
> There is a small window where a user can use the debugfs interfaces
> to access an invalid endpoint.
> b. cxl_event_config() at the end of cxl_pci_probe():
> cxl_event_config() invokes cxl_mem_get_event_record() to get the
> remaining event logs from the CXL device during cxl_pci_probe(). If CXL
> memdev probing failed before that, it is also possible to access
> an invalid endpoint.
> To fix these two cases, cxl_dpa_to_region() requires callers to hold
> the CXL memdev lock when calling it, and the CXL memdev driver binding
> status is checked. Holding the CXL memdev lock ensures that CXL memdev
> probing has completed, and if the CXL memdev driver is bound, then
> cxlmd->endpoint is valid. (PATCH #1-#3)
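
For readers following along, the rule described above (take the memdev
lock, then trust the endpoint only if the driver is bound) can be modeled
in plain userspace C. This is only a sketch under stated assumptions, not
the kernel code from the patches: struct memdev, struct endpoint,
memdev_init(), memdev_probe(), memdev_endpoint_locked() and
memdev_use_endpoint() are all hypothetical names, a pthread mutex stands in
for the device lock, and NULL stands in for the -ENXIO placeholder.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; NOT the kernel's CXL structures. */
struct endpoint {
	int id;
};

struct memdev {
	pthread_mutex_t lock;      /* models the memdev device lock */
	bool driver_bound;         /* models "memdev driver is bound" */
	struct endpoint *endpoint; /* trusted only while driver_bound is true */
};

static void memdev_init(struct memdev *md)
{
	pthread_mutex_init(&md->lock, NULL);
	md->driver_bound = false;
	md->endpoint = NULL; /* placeholder, like -ENXIO in the cover letter */
}

/* Models probing: runs entirely under the lock, so any caller that later
 * takes the lock sees either "not bound" or a fully set up endpoint. */
static int memdev_probe(struct memdev *md, struct endpoint *ep)
{
	pthread_mutex_lock(&md->lock);
	md->endpoint = ep;
	md->driver_bound = (ep != NULL);
	pthread_mutex_unlock(&md->lock);
	return ep ? 0 : -ENXIO;
}

/* Caller must hold md->lock. Returns NULL unless the driver is bound. */
static struct endpoint *memdev_endpoint_locked(struct memdev *md)
{
	if (!md->driver_bound)
		return NULL;
	return md->endpoint;
}

/* Models a consumer such as a debugfs handler: lock, check, then use. */
static int memdev_use_endpoint(struct memdev *md)
{
	int rc = -ENXIO;

	pthread_mutex_lock(&md->lock);
	struct endpoint *ep = memdev_endpoint_locked(md);
	if (ep)
		rc = ep->id;
	pthread_mutex_unlock(&md->lock);
	return rc;
}
```

In this model, calling memdev_use_endpoint() before memdev_probe() has
succeeded returns -ENXIO instead of dereferencing the placeholder, which is
the behavior the cover letter is after for the debugfs and event-record
paths.
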
>
> 2. cxl_reset_done() callback in the cxl_pci module.
> The cxl_reset_done() callback also accesses cxlmd->endpoint without any
> checking. If CXL memdev probing fails and cxl_reset_done() is then
> called by the PCI subsystem, it will access an invalid endpoint. The
> solution is to add a CXL memdev driver binding status check inside
> cxl_reset_done(). (PATCH #4)
>
> ---
> Changes in v2:
> - Move holding of the CXL memdev lock to cxl_debugfs_poison_inject/clear(). (Alison)
> - Drop device_lock_assert() in cxl_inject/clear_poison_locked(). (Alison)
> - Remove device_lock_assert() in cxl_dpa_to_region() to remove patch
> "cxl/region: Hold memdev lock during region poison injection/clear". (Alison)
> - Squash patch "cxl/pci: Hold memdev lock in cxl_event_trace_record()"
> and patch "cxl/region: Ensure endpoint is valid in cxl_dpa_to_region()". (Dan & Dave)
> - Remove patch "cxl/port: Reset cxlmd->endpoint to -ENXIO by default".
> - Link to v1: https://lore.kernel.org/r/20260310-fix_access_endpoint_without_drv_check-v1-0-94fe919a0b87@xxxxxxxxxxxx
>
> ---
> Li Ming (4):
> driver core: Add conditional guard support for device_lock()
> cxl/memdev: Hold memdev lock during memdev poison injection/clear
> cxl/pci: Hold memdev lock in cxl_event_trace_record()
> cxl/pci: Check memdev driver binding status in cxl_reset_done()
>
> drivers/cxl/core/mbox.c | 5 +++--
> drivers/cxl/core/region.c | 8 +++++---
> drivers/cxl/cxlmem.h | 2 +-
> drivers/cxl/mem.c | 10 ++++++++++
> drivers/cxl/pci.c | 3 +++
> include/linux/device.h | 1 +
> 6 files changed, 23 insertions(+), 6 deletions(-)
> ---
> base-commit: 11439c4635edd669ae435eec308f4ab8a0804808
> change-id: 20260308-fix_access_endpoint_without_drv_check-f2e6ff4bdc48

Applied to cxl/next
43e4c205197e cxl/pci: Check memdev driver binding status in cxl_reset_done()
11ce2524b7f3 cxl/pci: Hold memdev lock in cxl_event_trace_record()
b227d1faed0a cxl/memdev: Hold memdev lock during memdev poison injection/clear
e5564e392075 Merge tag 'device_lock_cond_guard-7.1-rc1' into for-7.1/cxl-consolidate-endpoint

>
> Best regards,