Re: [PATCH rdma-next] RDMA/cgroup: fix resource leak in DRIVER_FAILURE cleanup path

From: Leon Romanovsky

Date: Wed Jun 10 2026 - 04:58:11 EST


On Mon, May 18, 2026 at 11:15:41AM +0800, Tao Cui wrote:
> When a driver fails to destroy an RDMA object during ufile cleanup,
> the kernel retries and eventually falls back to the
> RDMA_REMOVE_DRIVER_FAILURE path. This path sets obj->object = NULL
> before calling uverbs_destroy_uobject(), which skips the destroy_hw
> callback. Since ib_rdmacg_uncharge() lives inside destroy_hw_idr_uobject(),
> the HCA_OBJECT cgroup charge is never released.
>
> Add an explicit ib_rdmacg_uncharge() call in the DRIVER_FAILURE path
> to prevent the resource counter leak.

This is not the correct approach. A cgroup limits how many resources a
task may consume, and a "failure to release" means the resource is still
in use, so its usage must stay accounted to that task.
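
To illustrate the accounting rule, here is a minimal userspace sketch.
It is a toy model only: toy_cgroup, toy_object and the toy_* helpers are
hypothetical names made up for this example and are not the kernel's
rdma cgroup API. It assumes a single counter per cgroup and a simulated
driver that can refuse to destroy an object.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, not the kernel's rdma cgroup API. */
struct toy_cgroup {
	int usage;	/* objects currently charged to the task */
	int max;	/* configured limit */
};

struct toy_object {
	struct toy_cgroup *cg;
	bool hw_alive;	/* true while the device still holds the object */
};

/* Charge one object; refuse if the limit is already reached. */
static bool toy_charge(struct toy_cgroup *cg)
{
	if (cg->usage >= cg->max)
		return false;
	cg->usage++;
	return true;
}

/* Drop one charge; the counter must never go below zero. */
static void toy_uncharge(struct toy_cgroup *cg)
{
	assert(cg->usage > 0);
	cg->usage--;
}

/* Create: charge first, then mark the object as live in hardware. */
static bool toy_create(struct toy_object *obj, struct toy_cgroup *cg)
{
	if (!toy_charge(cg))
		return false;
	obj->cg = cg;
	obj->hw_alive = true;
	return true;
}

/*
 * Destroy: uncharge only when the (simulated) driver really released
 * the object. On failure the object is still alive in hardware, so the
 * charge stays in place. Returns true on success.
 */
static bool toy_destroy(struct toy_object *obj, bool driver_fails)
{
	if (driver_fails)
		return false;
	obj->hw_alive = false;
	toy_uncharge(obj->cg);
	return true;
}
```

In this model, with max = 1, a failed toy_destroy() leaves usage at 1
and a second toy_create() is refused. Uncharging on the failure path
would instead let the task allocate a new object while the old one still
occupies the device, which defeats the limit.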

Thanks

>
> Signed-off-by: Tao Cui <cuitao@xxxxxxxxxx>
> ---
> drivers/infiniband/core/rdma_core.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/core/rdma_core.c b/drivers/infiniband/core/rdma_core.c
> index 5018ec837056..347ec8f6976b 100644
> --- a/drivers/infiniband/core/rdma_core.c
> +++ b/drivers/infiniband/core/rdma_core.c
> @@ -917,8 +917,11 @@ static int __uverbs_cleanup_ufile(struct ib_uverbs_file *ufile,
>  		 * racing with a lookup_get.
>  		 */
>  		WARN_ON(uverbs_try_lock_object(obj, UVERBS_LOOKUP_WRITE));
> -		if (reason == RDMA_REMOVE_DRIVER_FAILURE)
> +		if (reason == RDMA_REMOVE_DRIVER_FAILURE) {
>  			obj->object = NULL;
> +			ib_rdmacg_uncharge(&obj->cg_obj, ib_dev,
> +					   RDMACG_RESOURCE_HCA_OBJECT);
> +		}
>  		if (!uverbs_destroy_uobject(obj, reason, &attrs))
>  			ret = 0;
>  		else
> --
> 2.43.0
>