[PATCH v6] cxl/region: Fix wrong logic for finding a free switch cxl decoder

From: Zijun Hu
Date: Wed Oct 16 2024 - 18:00:36 EST


From: Zijun Hu <quic_zijuhu@xxxxxxxxxxx>

Provided that all child switch cxl decoders are sorted by ID in ascending
order, then it is wrong for current match_free_decoder()'s logic to find
a free cxl decoder as explained below:

Port
├── cxld A <----> region A
├── cxld B // no region
├── cxld C <----> region C

Current logic will find cxld B as a free one, but cxld B is not true
free since region C has not been torn down, so current logic is wrong.

Fixed by verifying if cxl decoder with ID (@port->commit_end + 1) can
be returned as finding result.

Link: https://lore.kernel.org/all/670af54931b8_964fe29427@xxxxxxxxxxxxxxxxxxxxxxxxx.notmuch/
Suggested-by: Dan Williams <dan.j.williams@xxxxxxxxx>
Signed-off-by: Zijun Hu <quic_zijuhu@xxxxxxxxxxx>
---
This patch is incremental to below patch series with title "cxl: Initialization and shutdown fixes"
http://lore.kernel.org/172862483180.2150669.5564474284074502692.stgit@xxxxxxxxxxxxxxxxxxxxxxxxx

This patch is based on Dan's recommendation shown by below link:
https://lore.kernel.org/all/670af54931b8_964fe29427@xxxxxxxxxxxxxxxxxxxxxxxxx.notmuch/
---
Changes in v6:
- Use Dan's recommendation that validate clxd with ID (@port->commit_end + 1)
- Correct commit message and update Link tag this commit derive from
- Link to v5: https://lore.kernel.org/r/20240917-const_dfc_prepare-v5-1-0e20f673ee0c@xxxxxxxxxxx

Changes in v5:
- Use changes suggested by Dan Williams for cxl/region patch
- Correct title and commit message for cxl/region change
- Seperate qcom/emac change for patch series
- Link to v4: https://lore.kernel.org/r/20240905-const_dfc_prepare-v4-0-4180e1d5a244@xxxxxxxxxxx

Changes in v4:
- Drop driver core patch
- Correct commit message for cxl/region patch
- Correct title and commit message for qcom/emac patch
- Link to v3: https://lore.kernel.org/r/20240824-const_dfc_prepare-v3-0-32127ea32bba@xxxxxxxxxxx

Changes in v3:
- Git rebase
- Correct commit message for the driver core patch
- Use changes suggested by Ira Weiny cxl/region
- Drop firewire core patch
- Make qcom/emac follow cxl/region solution suggested by Greg
- Link to v2: https://lore.kernel.org/r/20240815-const_dfc_prepare-v2-0-8316b87b8ff9@xxxxxxxxxxx

Changes in v2:
- Give up introducing the API constify_device_find_child_helper()
- Correct commit message and inline comments
- Implement a driver specific and equivalent one instead of device_find_child()
- Link to v1: https://lore.kernel.org/r/20240811-const_dfc_prepare-v1-0-d67cc416b3d3@xxxxxxxxxxx
---
drivers/cxl/core/region.c | 35 +++++++++++++++++++++++------------
1 file changed, 23 insertions(+), 12 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index e701e4b04032..a7a82da39393 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -796,8 +796,9 @@ static size_t show_targetN(struct cxl_region *cxlr, char *buf, int pos)

static int match_free_decoder(struct device *dev, void *data)
{
+ struct cxl_port *port = to_cxl_port(dev->parent);
struct cxl_decoder *cxld;
- int *id = data;
+ struct device **target_dev = data;

if (!is_switch_decoder(dev))
return 0;
@@ -805,15 +806,24 @@ static int match_free_decoder(struct device *dev, void *data)
cxld = to_cxl_decoder(dev);

/* enforce ordered allocation */
- if (cxld->id != *id)
- return 0;
+ if (cxld->id == port->commit_end + 1) {
+ if (!cxld->region) {
+ *target_dev = dev;
+ return 1;
+ }

- if (!cxld->region)
- return 1;
+ dev_dbg(dev->parent,
+ "%s: cxld '%s' was reserved by '%s'\n", __func__,
+ dev_name(dev), dev_name(&cxld->region->dev));
+ return -EBUSY;
+ }

- (*id)++;
+ if (cxld->flags & CXL_DECODER_F_ENABLE)
+ return 0;

- return 0;
+ dev_dbg(dev->parent, "%s: cxld '%s' was out of order shut down\n",
+ __func__, dev_name(dev));
+ return -EINVAL;
}

static int match_auto_decoder(struct device *dev, void *data)
@@ -839,17 +849,18 @@ cxl_region_find_decoder(struct cxl_port *port,
struct cxl_endpoint_decoder *cxled,
struct cxl_region *cxlr)
{
- struct device *dev;
- int id = 0;
+ struct device *dev = NULL;

if (port == cxled_to_port(cxled))
return &cxled->cxld;

- if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags))
+ if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) {
dev = device_find_child(&port->dev, &cxlr->params,
match_auto_decoder);
- else
- dev = device_find_child(&port->dev, &id, match_free_decoder);
+ } else {
+ device_for_each_child(&port->dev, &dev, match_free_decoder);
+ get_device(dev);
+ }
if (!dev)
return NULL;
/*

---
base-commit: 9bd133f05b1dca5ca4399a76d04d0f6f4d454e44
change-id: 20240811-const_dfc_prepare-3ff23c6598e5

Best regards,
--
Zijun Hu <quic_zijuhu@xxxxxxxxxxx>