Re: [PATCH v14 net-next 1/9] octeontx2-af: npc: cn20k: debugfs enhancements

From: Ratheesh Kannoth

Date: Fri May 15 2026 - 03:28:34 EST

On 2026-05-14 at 11:55:29, Ratheesh Kannoth (rkannoth@xxxxxxxxxxx) wrote:
> Improve MCAM visibility and field debugging for CN20K NPC.
>
> - Extend "mcam_layout" to show enabled (+) or disabled state per entry
> so status can be verified without parsing the full "mcam_entry" dump.
> - Add "dstats" debugfs entry: reports recently hit MCAM indices with
> packet counts; stats are cleared on read so each read shows deltas.
> - Add "mismatch" debugfs entry: lists MCAM entries that are enabled
> but not explicitly allocated, helping diagnose allocation/field issues.
>

>> diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
>> index 3d5eb952cc073..9567a2d80b584 100644
>> --- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
>> +++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.h
>[ ... ]
>> @@ -193,6 +194,9 @@ struct npc_priv_t {
>> const int num_banks;
>> int num_subbanks;
>> int subbank_depth;
>> + DECLARE_BITMAP(en_map, MAX_NUM_BANKS *
>> + MAX_NUM_SUB_BANKS *
>> + MAX_SUBBANK_DEPTH);
>[ ... ]
>> diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
>> index 6b3f453fd5004..e9aad0ad3fa68 100644
>> --- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
>> +++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/npc.c
>[ ... ]
>> @@ -842,6 +842,12 @@ npc_cn20k_enable_mcam_entry(struct rvu *rvu, int blkaddr,
>> cfg);
>> }
>>
>> +update_en_map:
>> + if (enable)
>> + set_bit(index, npc_priv.en_map);
>> + else
>> + clear_bit(index, npc_priv.en_map);
>> +
>> return 0;
>> }
>Will index ever exceed the static size of en_map?
>If the hardware reports larger values, could the set_bit() call here perform
>an out-of-bounds write and corrupt memory?
>

The index is guaranteed to stay within the static bounds of en_map.

The bitmap dimensions are calculated using the constants:
MAX_NUM_BANKS * MAX_NUM_SUB_BANKS * MAX_SUBBANK_DEPTH. This product
represents the absolute maximum theoretical MCAM capacity for the
CN20K NPC architecture.

Furthermore, the index parameter is strictly validated upstream
within the RVU admin function layer (specifically via the resource
allocation checks in npc_get_mcam_idx) before reaching this function.
Because the hardware geometry is fixed and the index is validated
against the discovered capacity during initialization, an
out-of-bounds access is impossible.

>> +static u64 dstats[MAX_NUM_BANKS][MAX_SUBBANK_DEPTH * MAX_NUM_SUB_BANKS] = {};
>> +static int npc_mcam_dstats_show(struct seq_file *s, void *unused)
>> +{
>[ ... ]
>> + delta = stats - dstats[bank][idx];
>> +
>> + snprintf(buff, sizeof(buff), "%u\t%#04x\t%llu\n",
>> + mcam_idx, pf, delta);
>> + seq_puts(s, buff);
>> +
>> + dstats[bank][idx] = stats;
>> + }
>> + }
>> +
>> + mutex_unlock(&stats_lock);
>> + return 0;
>> +}
>Is it safe to modify state within a seq_file show callback?
>The kernel's seq_file framework expects the show function to be side-effect
>free. If the formatted output exceeds the initial buffer size, seq_read_iter()
>discards the buffer, allocates a larger one, and re-invokes the show function.
>If that happens, wouldn't the retried invocation calculate a delta of zero
>because dstats was already updated in the first attempt, causing the
>statistics to be lost?

The seq_file show callback state mutation is safe in this context
because the implementation enforces a single-allocation guarantee
upstream via single_open_size().

During the .open file operation, single_open_size() is invoked with a
pre-calculated total size sufficient to contain the entire output
payload for all banks. This forces the seq_file framework to allocate
the full required buffer upfront.

Because the sequence allocation size matches the total data footprint,
seq_read_iter() will never encounter an overflow condition, avoiding the
discard-and-retry path entirely. Consequently, the show() callback is
guaranteed to execute exactly once per sequence read, eliminating the
possibility of calculating a zero delta on a re-entrant call.

>Additionally, dstats is declared as a global static array. If there are
>multiple RVU device instances, won't they overwrite each other's state and
>corrupt the delta calculations?
>The subsequent commit "octeontx2-af: npc: cn20k: Allocate npc_priv and dstats
>dynamically." might address the global array issue, but the seq_file state
>mutation appears to remain.

the system topology guarantees there is only a single RVU Admin Function (AF) device instance
active during the system life cycle. Because there are no concurrent
or multiple RVU AF instances to cause cross-device data corruption, the
global static array does not introduce an overwrite or race condition
between separate devices.

Moving dstats to a dynamic per-device structure in the subsequent patch
remains a structural improvement, but the logic in this transient commit
is functionally sound and free of data corruption risks.