Re: [PATCH] perf/arm-cmn: Workaround AmpereOneX errata AC04_MESH_1 (incorrect child count)

From: Robin Murphy
Date: Tue Feb 06 2024 - 05:00:53 EST


On 2024-02-05 7:46 pm, Ilkka Koskinen wrote:
> AmpereOneX mesh implementation has a bug in HN-P nodes that makes them
> report incorrect child count. The failing crosspoints report 8 children
> while they only have two.

Ooh, fun :)

> When the driver tries to access the inexistent child nodes, it believes it
> has reached an invalid node type and probing fails. The workaround is to
> ignore those incorrect child nodes and continue normally.
>
> Signed-off-by: Ilkka Koskinen <ilkka@xxxxxxxxxxxxxxxxxxxxxx>
> ---
> drivers/perf/arm-cmn.c | 25 +++++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
> diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
> index c584165b13ba..97fed8ec3693 100644
> --- a/drivers/perf/arm-cmn.c
> +++ b/drivers/perf/arm-cmn.c
> @@ -2168,6 +2168,23 @@ static enum cmn_node_type arm_cmn_subtype(enum cmn_node_type type)
> }
> }
> +static inline bool arm_cmn_is_ampereonex_bug(const struct arm_cmn *cmn,
> + struct arm_cmn_node *dn,
> + u16 child_count, int child)
> +{
> + /*
> + * The bug occurs only when a crosspoint reports 8 children
> + * while it only has two HN-P child nodes.
> + */
> + dn -= 2;
> +
> + if (arm_cmn_model(cmn) == CMN650 && child_count == 8 &&
> + child == 2 && dn->type == CMN_TYPE_HNP)
> + return true;
> +
> + return false;
> +}
> +
> static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
> {
> void __iomem *cfg_region;
> @@ -2292,6 +2309,14 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
> for (j = 0; j < child_count; j++) {
> reg = readq_relaxed(xp_region + child_poff + j * 8);
> + if (reg == 0)
> + if (arm_cmn_is_ampereonex_bug(cmn, dn, child_count, j))
> + /*
> + * We know there are only two real children and the rest 6
> + * are inexistent. Thus, we can skip the rest of the loop
> + */
> + break;
+

TBH I don't see much harm in taking an even simpler approach, so I'd be
inclined to not bother being all that specific beyond documenting it,
something like the below:

Cheers,
Robin.

----->8-----

diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
index c584165b13ba..7e3aa7e2345f 100644
--- a/drivers/perf/arm-cmn.c
+++ b/drivers/perf/arm-cmn.c
@@ -2305,6 +2305,17 @@ static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
dev_dbg(cmn->dev, "ignoring external node %llx\n", reg);
continue;
}
+ /*
+ * AmpereOneX erratum AC04_MESH_1 makes some XPs report a bogus
+ * child count larger than the number of valid child pointers.
+ * A child offset of 0 can only occur on CMN-600; otherwise it
+ * would imply the root node being its own grandchild, which
+ * we can safely dismiss in general.
+ */
+ if (reg == 0 && cmn->part != PART_CMN600) {
+ dev_dbg(cmn->dev, "bogus child pointer?\n");
+ continue;
+ }
arm_cmn_init_node_info(cmn, reg & CMN_CHILD_NODE_ADDR, dn);
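
For illustration only, the effect of that check can be modelled outside the kernel. The sketch below is not driver code: the enum, its values, and the helper name count_valid_children() are hypothetical stand-ins, and the child pointers are a plain array rather than MMIO reads. It assumes, as in the description above, that an affected XP reports 8 children of which only the first two pointers are non-zero.

```c
#include <stdint.h>

/*
 * Hypothetical stand-ins for the driver's part identifiers; the names and
 * values here are assumptions made for this sketch only.
 */
enum sketch_part {
	SKETCH_PART_CMN600,
	SKETCH_PART_OTHER,
};

/*
 * Walk the reported child pointers and count the ones that would be passed
 * on for node initialisation. A zero pointer is skipped unless the part is
 * CMN-600, where a child offset of 0 is treated as legitimate.
 */
static int count_valid_children(enum sketch_part part,
				const uint64_t *child_ptrs, int child_count)
{
	int valid = 0;

	for (int j = 0; j < child_count; j++) {
		uint64_t reg = child_ptrs[j];

		/* Mirrors the "bogus child pointer" skip in the diff above */
		if (reg == 0 && part != SKETCH_PART_CMN600)
			continue;

		valid++;
	}

	return valid;
}
```

Calling count_valid_children() on an array holding two non-zero pointers followed by six zeros, with a reported count of 8, yields 2 for a non-CMN-600 part, so the bogus entries are dropped without needing to know which node types precede them.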