Re: [bug report] Potential refcounting

From: Tariq Toukan

Date: Wed Apr 29 2026 - 04:37:26 EST




On 27/04/2026 5:07, Ginger wrote:
Dear Linux kernel maintainers,

My research-based static analyzer found a potential
refcounting/atomicity bug within the
'drivers/net/ethernet/mellanox/mlx4' subsystem, more specifically, in
'drivers/net/ethernet/mellanox/mlx4/cq.c'.

Kernel version: long-term kernel v6.18.9

Potential concurrent triggering executions:
T0:
  mlx4_cq_tasklet_cb
    --> if (refcount_dec_and_test(&mcq->refcount))
      --> complete(&mcq->free)

T1:
  mlx4_cq_completion
    --> cq->comp(cq);
      --> mlx4_add_cq_to_tasklet(struct mlx4_cq *cq)
        --> spin_lock_irqsave(&tasklet_ctx->lock, flags);
        --> refcount_inc(&cq->refcount);
        --> spin_unlock_irqrestore(&tasklet_ctx->lock, flags);

In T1, the increment of 'cq->refcount', although performed while
holding 'tasklet_ctx->lock', is not synchronized against T0, because
'refcount_inc()' does not check whether the refcount has already
reached zero in T0. This is potentially problematic because T0
decrements 'mcq->refcount' and can thereby allow 'mlx4_cq_free()' to
proceed.

Thank you for your time and consideration.

Best regards,
Ginger


Hi,

Thanks for your report.

IMO the described race is impossible.

CQs that work with mlx4_add_cq_to_tasklet as their comp() callback (i.e. T1) are added to the relevant list only after refcount is incremented.

Hence, if a CQ exists in the list in T0, it necessarily means that refcount is already elevated, and calling refcount_dec_and_test is safe.
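
To illustrate the ordering argument, here is a minimal single-threaded
userspace sketch. It is not the mlx4 code: the model_* names, the plain
int counter and the on_list flag are all hypothetical stand-ins, and the
locking is left out. It only shows that when the reference is taken
before the CQ becomes visible on the list, the consumer side always
finds a non-zero count to drop.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a CQ; not the real struct mlx4_cq. */
struct model_cq {
	int refcount;	/* stands in for cq->refcount */
	bool on_list;	/* stands in for membership in the tasklet list */
};

/* Creation: the owner holds the initial reference. */
static void model_cq_init(struct model_cq *cq)
{
	cq->refcount = 1;
	cq->on_list = false;
}

/*
 * Producer side (T1 in the report): take a reference first, and only
 * then make the CQ visible on the list. The already-queued check is an
 * assumption of this sketch. Returns true if the CQ was newly queued.
 */
static bool model_add_to_tasklet(struct model_cq *cq)
{
	if (cq->on_list)
		return false;
	cq->refcount++;		/* reference taken before publication */
	cq->on_list = true;	/* now visible to the consumer */
	return true;
}

/*
 * Consumer side (T0 in the report): a CQ found on the list must carry
 * the reference taken by the producer, so the count cannot be zero here.
 * Returns true if this dropped the last reference.
 */
static bool model_tasklet_cb(struct model_cq *cq)
{
	assert(cq->on_list);
	assert(cq->refcount >= 1);	/* invariant: on the list => ref held */
	cq->on_list = false;
	return --cq->refcount == 0;
}
```

In this model, the consumer can only drop the last reference if the
owner has already dropped the initial one, which is the case where
waking up the waiter in the free path is the intended outcome.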

Regards,
Tariq