Re: [PATCH 7/7] cxl/port: Reset cxlmd->endpoint to -ENXIO by default
From: Dan Williams
Date: Tue Mar 10 2026 - 15:30:46 EST
Li Ming wrote:
> cxlmd->endpoint is set to -ENXIO by default, This intentional invalid
> value ensures that any unintended access to the endpoint will be
> detectable. But CXL driver didn't reset it to the default value when the
> endpoint is released in delete_endpoint().
>
> Besides, cxlmd->endpoint is updated to point to an valid endpoint in
> cxl_port_add(), but if the device_add() in cxl_port_add() fails, the
> endpoint will be released, but cxlmd->endpoint remains pointing to the
> released endpoint, it may introduce a potential use-after-free issue.
>
> Fixes: 3d8be8b398e3 ("cxl: Set cxlmd->endpoint before adding port device")
> Fixes: 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation")
So I disagree that this is a fix. It may be a consistency change to make
sure that all invalid ->endpoint accesses have the same result, but it
is not a fix for any discernable end user problem.
> Signed-off-by: Li Ming <ming.li@xxxxxxxxxxxx>
> ---
> drivers/cxl/core/port.c | 14 ++++++++++----
> 1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 0c5957d1d329..ec3cb62b44b7 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -834,11 +834,14 @@ static int cxl_port_add(struct cxl_port *port,
> struct cxl_dport *parent_dport)
> {
> struct device *dev __free(put_device) = &port->dev;
> + struct cxl_memdev *cxlmd = NULL;
> int rc;
>
> if (is_cxl_memdev(port->uport_dev)) {
> - struct cxl_memdev *cxlmd = to_cxl_memdev(port->uport_dev);
> - struct cxl_dev_state *cxlds = cxlmd->cxlds;
> + struct cxl_dev_state *cxlds;
> +
> + cxlmd = to_cxl_memdev(port->uport_dev);
> + cxlds = cxlmd->cxlds;
>
> rc = dev_set_name(dev, "endpoint%d", port->id);
> if (rc)
> @@ -865,8 +868,11 @@ static int cxl_port_add(struct cxl_port *port,
> }
>
> rc = device_add(dev);
> - if (rc)
> + if (rc) {
> + if (cxlmd)
> + cxlmd->endpoint = ERR_PTR(-ENXIO);
I see no need for this. All of the upper levels paths ensure that all
the results of setup are torn down on failure. Consider that if
device_add() succeeds and any of the follow on functions in
__devm_cxl_add_port() then this cleanup will also not happen.
So you must rely on upper level cleanup. This is similar to the
semantics of dev_set_drvdata(). That is only reset up after _probe()
finishes failing.
Now, if you want to do some separate cleanups in this space I would redo
the anti-patterns of scope-based cleanup in cxl_port_alloc() and
cxl_port_add(). cxl_port_alloc() does the "__free = NULL" anti-pattern.
It appears to be doing that to avoid adding a DEFINE_FREE() for ida
cleanup if the port allocation fails.
Also, if the cxl_port_alloc() call became something like:
struct cxl_port *port __free(put_cxl_port) = cxl_port_alloc(...)
...then I do not think cxl_port_add() would need to do the odd
__free(put_device()) initialization without a corresponding
get_device(). Nor the odd "dev = NULL" at the end.
> return rc;
> + }
>
> /* Inhibit the cleanup function invoked */
> dev = NULL;
> @@ -1425,7 +1431,7 @@ static void delete_endpoint(void *data)
> devm_release_action(host, cxl_unlink_uport, endpoint);
> devm_release_action(host, unregister_port, endpoint);
> }
> - cxlmd->endpoint = NULL;
> + cxlmd->endpoint = ERR_PTR(-ENXIO);
I also see no need for this. Either dereferencing NULL or ERR_PTR() will
result in what is needed, a kernel crash. How the use after-free crash
is signalled does not really matter. The only expectation that other
code has is that "->endpoint != NULL" does not mean ->endpoint is valid.
Nothing should care about that though because cxlmd->dev.driver is the
authoritative gate for ->endpoint validity.