Re: [PATCH v5 net 03/10] octeontx2-af: npc: cn20k: Propagate errors in defrag MCAM alloc rollback

From: Ratheesh Kannoth

Date: Thu Apr 30 2026 - 00:13:49 EST


On 2026-04-29 at 07:57:15, Ratheesh Kannoth (rkannoth@xxxxxxxxxxx) wrote:
> npc_defrag_alloc_free_slots() allocates MCAM indexes in up to two passes on
> bank0 then bank1. On failure it rolls back by freeing entries already
> placed in save[].
>
> __npc_subbank_alloc() can return a negative errno while only part of the
> indexes are valid. The rollback loop used rc for
> npc_mcam_idx_2_subbank_idx() as well, so a successful lookup stored zero in
> rc and a later __npc_subbank_free() failure could still end with return 0
> when the allocation path had also left rc at zero (for example shortfall
> after zero return values from the alloc helpers).
>
> Jump to the rollback path immediately when either __npc_subbank_alloc()
> call fails, preserving its errno. If both calls succeed but the total
> allocated count is still less than cnt, set rc to -ENOSPC before rollback.
> Use a separate err variable for npc_mcam_idx_2_subbank_idx() so a
> successful lookup no longer clears a non-zero rc from the allocation phase.
>

>>could the commit message be updated, or are there missing checks for the
>>return value of __npc_subbank_alloc() in the code?
We clearly mentioned in the commit message that if both fail, rc is set to -ENOSPC.

> @@ -3529,6 +3530,7 @@ static int npc_defrag_alloc_free_slots(struct rvu *rvu,
> NPC_MCAM_LOWER_PRIO,
> false, cnt, save, cnt, true,
> &alloc_cnt1);
> +
> if (alloc_cnt1 < cnt) {
> rc = __npc_subbank_alloc(rvu, sb,
> NPC_MCAM_KEY_X2, sb->b1b,
>If the first __npc_subbank_alloc() call fails with an error like -ENOMEM,
>alloc_cnt1 is set to 0.
>Would this make the alloc_cnt1 < cnt check evaluate to true, causing the
>code to attempt the second allocation instead of jumping to the rollback
>path immediately?
Rollback is already done inside __npc_subbank_alloc(), so there is no need to roll back here.
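
To make the point concrete, here is a minimal standalone sketch of a helper
that undoes its own partial work before returning an error. It is not the
driver code: toy_subbank, toy_subbank_alloc() and the fail_at knob are
hypothetical names used only for illustration, and it assumes the helper
reports a count of 0 whenever it returns an error.

```c
#include <errno.h>
#include <stdbool.h>

#define TOY_SLOTS 8

/* Hypothetical stand-in for a sub-bank; not the driver's structure. */
struct toy_subbank {
	bool used[TOY_SLOTS];
	int fail_at;		/* slot index that triggers an error, -1 for none */
};

/*
 * Allocate up to 'want' free slots into out[].
 * On error, every slot taken during this call is released again and
 * *got is left at 0, so the caller has nothing to roll back.
 * A return of 0 with *got < want means the sub-bank simply ran short.
 */
static int toy_subbank_alloc(struct toy_subbank *sb, int want,
			     int *out, int *got)
{
	int i, n = 0;

	*got = 0;
	for (i = 0; i < TOY_SLOTS && n < want; i++) {
		if (sb->used[i])
			continue;
		if (i == sb->fail_at) {
			/* Internal rollback of the partial allocation. */
			while (n > 0)
				sb->used[out[--n]] = false;
			return -ENOMEM;
		}
		sb->used[i] = true;
		out[n++] = i;
	}
	*got = n;
	return 0;
}
```

With a helper shaped like this, a caller that sees an error also sees a count
of 0, so no indexes from the failed call remain in its save[] array.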

> @@ -3544,15 +3546,17 @@ static int npc_defrag_alloc_free_slots(struct rvu *rvu,
> dev_err(rvu->dev,
> "%s: Failed to alloc cnt=%u alloc_cnt1=%u alloc_cnt2=%u\n",
> __func__, cnt, alloc_cnt1, alloc_cnt2);
> + rc = -ENOSPC;
> goto fail_free_alloc;
> }
>If the allocation fails and sets rc to the actual error code, won't this
>block unconditionally overwrite the preserved errno with -ENOSPC before
>jumping to the rollback path?
You are correct that if the first call returns an error, alloc_cnt1 is 0 and
we proceed to the second call. This is intentional. The defrag logic attempts
to gather the required total entries (cnt) across multiple sub-banks. If the
first bank fails or is full, we must still try the second bank to see if the
combined total can satisfy the request. Since __npc_subbank_alloc() handles
its own internal rollback on failure, there is no leaked state when we fall
through to the second attempt.

The use of -ENOSPC here is intentional for the defrag context. Even if a
sub-call returned -ENOMEM, the high-level failure at this stage is that the
hardware does not have the contiguous space required to perform the defrag
move.
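
The flow described above can be sketched as a small standalone program
fragment. This is an illustration only, not the driver code: toy_bank,
toy_alloc(), toy_free_one() and toy_defrag_alloc() are hypothetical names, and
the sketch assumes the alloc helper leaves nothing allocated when it returns
an error.

```c
#include <errno.h>

/* Hypothetical stand-in for one bank of a sub-bank. */
struct toy_bank {
	int free_slots;		/* slots still available */
	int used;		/* slots handed out so far */
	int base;		/* first index of this bank */
	int fail_errno;		/* non-zero: alloc fails with this value */
};

/* Assumed contract: on error nothing stays allocated and *got is 0. */
static int toy_alloc(struct toy_bank *b, int want, int *out, int *got)
{
	int i, n;

	*got = 0;
	if (b->fail_errno)
		return b->fail_errno;

	n = want < b->free_slots ? want : b->free_slots;
	for (i = 0; i < n; i++)
		out[i] = b->base + b->used + i;
	b->used += n;
	b->free_slots -= n;
	*got = n;
	return 0;
}

static void toy_free_one(struct toy_bank *b)
{
	b->used--;
	b->free_slots++;
}

/* Gather cnt entries across two banks, or fail with -ENOSPC. */
static int toy_defrag_alloc(struct toy_bank *b0, struct toy_bank *b1,
			    int cnt, int *save)
{
	int got0 = 0, got1 = 0, i;

	/* An error here is not fatal: got0 is 0 and bank1 is tried next. */
	toy_alloc(b0, cnt, save, &got0);

	if (got0 < cnt)
		toy_alloc(b1, cnt - got0, save + got0, &got1);

	if (got0 + got1 == cnt)
		return 0;

	/* Shortfall: undo what was placed in save[] and report no space. */
	for (i = 0; i < got0; i++)
		toy_free_one(b0);
	for (i = 0; i < got1; i++)
		toy_free_one(b1);

	return -ENOSPC;
}
```

In this sketch a failing first bank followed by a second bank with enough room
still succeeds, while any combined shortfall is reported as -ENOSPC regardless
of what the individual calls returned.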