Re: [PATCH v19 net-next 2/9] octeontx2-af: npc: cn20k: debugfs enhancements

From: Ratheesh Kannoth

Date: Sun Jun 07 2026 - 22:21:55 EST


On 2026-06-05 at 12:02:38, Ratheesh Kannoth (rkannoth@xxxxxxxxxxx) wrote:
> Improve MCAM visibility and field debugging for CN20K NPC.
>
> - Extend "mcam_layout" to show enabled (+) or disabled state per entry
> so status can be verified without parsing the full "mcam_entry" dump.
> - Add "dstats" debugfs entry: for enabled MCAM indices, print hit deltas
> since the prior read by comparing hardware counters to a per-entry
> software baseline and advancing that baseline after each read (hardware
> counters are not cleared).
> - Add "mismatch" debugfs entry: lists MCAM entries that are enabled
> but not explicitly allocated, helping diagnose allocation/field issues.
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@xxxxxxxxxxx>

https://sashiko.dev/#/patchset/20260605063245.3553861-1-rkannoth%40marvell.com says

>> +
>> + snprintf(buff, sizeof(buff), "%u\t%#04x\t%llu\n",
>> + mcam_idx, pf, delta);
>> + seq_puts(s, buff);
>> +
>> + dstats[bank][idx] = stats;
>Does this loop read and write out-of-bounds of the dstats array if bank_depth
>is larger than expected? As discussed below in npc_priv_init(), bank_depth
>can bypass validation. If that happens, iterating up to bank_depth - 1 will
>exceed the fixed MAX_SUBBANK_DEPTH * MAX_NUM_SUB_BANKS array dimensions.

npc_priv_init() does check that these values are within limits during probe().

>Can stale baselines in dstats cause inaccurate delta calculations when MCAM
>entries are freed and reallocated?

Yes, but that is OK since dstats is used to debug field issues. The first read
may show wrong data, but from the second read onwards the data will be correct,
as the s/w baseline array is updated to the hw stats.

>While the reset logic attempts to handle this with:
>if (stats < dstats[bank][idx]) dstats[bank][idx] = 0;
>This reset will fail to trigger if a newly allocated entry accumulates hits
>greater than or equal to the stale baseline before the very first debugfs
>read. In this scenario, the stale baseline would be silently subtracted,
>drastically undercounting the statistics.

No. The dstats[bank][idx] = stats line sets the s/w baseline array to match
the hw stats.
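
To make the scenario above concrete, here is a minimal user-space sketch of the
baseline logic. It is not the driver code: dstats_delta() and its parameters are
hypothetical names, and it assumes the hw counter restarts from 0 when an entry
is reallocated.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the driver function: return the hits seen since
 * the previous read and advance the software baseline to the hw value.
 */
static uint64_t dstats_delta(uint64_t *baseline, uint64_t hw_stats)
{
	uint64_t delta;

	/* Counter went backwards (assumed: entry was reallocated), drop baseline. */
	if (hw_stats < *baseline)
		*baseline = 0;

	delta = hw_stats - *baseline;

	/* The hw counter is not cleared; only the software baseline moves. */
	*baseline = hw_stats;
	return delta;
}
```

With a stale baseline of 100, a first read at hw_stats = 150 reports 50 rather
than 150, and a second read at hw_stats = 180 reports 30, because the baseline
was advanced to 150 by the first read.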

>> @@ -842,6 +842,12 @@ npc_cn20k_enable_mcam_entry(struct rvu *rvu, int blkaddr,
>> cfg);
>> }
>>
>> +update_en_map:
>> + if (enable)
>> + set_bit(index, npc_priv.en_map);
>> + else
>> + clear_bit(index, npc_priv.en_map);
>> +
>Can index exceed the bounds of the en_map bitmap?
>Because en_map is allocated for exactly MAX_NUM_BANKS * MAX_NUM_SUB_BANKS *
>MAX_SUBBANK_DEPTH (16384) bits, an index corresponding to an oversized
>bank_depth (as described in npc_priv_init() below) could cause set_bit() to
>silently corrupt adjacent fields in struct npc_priv_t.

No. npc_priv_init() checks the values read from hw against these macros
(MAX_NUM_BANKS, MAX_NUM_SUB_BANKS, MAX_SUBBANK_DEPTH) during probe(). If the hw
values exceed these, probe() itself will fail.

>> }
>>
>> - npc_priv.num_subbanks = num_subbanks;
>> -
>> subbank_depth = bank_depth / num_subbanks;
>> + if (subbank_depth > MAX_SUBBANK_DEPTH) {
>> + dev_err(rvu->dev,
>> + "Invalid subbank depth %u\n",
>> + subbank_depth);
>> + return -EINVAL;
>> + }
>>
>> + npc_priv.num_subbanks = num_subbanks;
>> npc_priv.bank_depth = bank_depth;
>> npc_priv.subbank_depth = subbank_depth;
>Does integer division truncation allow an invalid bank_depth to bypass this
>check?

This is the case when the hw read value is 0. Even though this case won't
happen in hw, we will post a fix patch after this series to check against 0 and
return an error (during probe()).

>If the hardware-provided bank_depth is not an exact multiple of num_subbanks
>(for example, if bank_depth is 8223 and num_subbanks is 32), subbank_depth
>truncates to 256. This passes the MAX_SUBBANK_DEPTH check, allowing bank_depth
>to remain 8223. This oversized bank_depth then drives loops and calculations
>in other functions, leading to the out-of-bounds accesses in debugfs and the
>en_map bitmap operations highlighted above.

In all SoCs, bank_depth is an exact multiple of num_subbanks. We can add a check
in npc_priv_init() during probe() (as a hardening series to net-next after this
patch is merged).
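
A rough sketch of what such a hardening check could look like, written as
stand-alone C rather than as the actual follow-up patch. npc_check_geometry()
is a hypothetical name, and the two limits are placeholder values picked to
match the example numbers in the review above; the real values come from the
driver headers.

```c
#include <errno.h>

/* Placeholder limits (assumed); the real macros live in the driver headers. */
#define MAX_NUM_SUB_BANKS	32
#define MAX_SUBBANK_DEPTH	256

/* Hypothetical helper: 0 if the geometry is usable, -EINVAL otherwise. */
static int npc_check_geometry(unsigned int bank_depth,
			      unsigned int num_subbanks)
{
	/* Reject zero so the division below is defined and the depth is non-empty. */
	if (!bank_depth || !num_subbanks)
		return -EINVAL;

	if (num_subbanks > MAX_NUM_SUB_BANKS)
		return -EINVAL;

	/* A remainder would be silently dropped by the integer division. */
	if (bank_depth % num_subbanks)
		return -EINVAL;

	if (bank_depth / num_subbanks > MAX_SUBBANK_DEPTH)
		return -EINVAL;

	return 0;
}
```

With these placeholder limits, bank_depth = 8192 with num_subbanks = 32 passes,
while 8223 (not a multiple of 32), 8224 (subbank depth 257) and 0 are rejected.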