Re: [PATCH] dmaengine: idxd: fix use-after-free in idxd_free() and idxd_alloc() error paths
From: Frank Li
Date: Wed Jun 17 2026 - 17:09:45 EST
On Mon, Jun 15, 2026 at 01:39:32PM +0300, Bogdan Codres (Wind River) wrote:
> From: Bogdan Codres <bogdan.codres@xxxxxxxxxxxxx>
>
> We have the following backtrace:
> [ 18.628791] idxd 0000:00:01.0: Device is HALTED!
> [ 18.631447] idxd 0000:00:01.0: Intel(R) IDXD DMA Engine init failed
> [ 18.631450] ------------[ cut here ]------------
> [ 18.631451] ida_free called for id=0 which is not allocated.
> [ 18.631462] WARNING: CPU: 0 PID: 11 at lib/idr.c:525 ida_free+0xd3/0x130
> [ 18.631467] Modules linked in: idxd(+) idxd_bus wmi zl3073x_spi regmap_spi zl3073x_i2c zl3073x i2c_mux_pca954x i2c_mux ipmi_si acpi_power_meter i2c_designware_platform i2c_designware_core acpi_ipmi ipmi_devintf ipmi_msghandler
> [ 18.631474] CPU: 0 UID: 0 PID: 11 Comm: kworker/0:1 Not tainted 6.12.0-1-rt-amd64 #1 Debian 6.12.40-1.stx.140
> [ 18.631477] Hardware name: Dell Inc. PowerEdge XR8720t/0J91KV, BIOS 1.1.3 02/03/2026
> [ 18.631478] Workqueue: events work_for_cpu_fn
> [ 18.631480] RIP: 0010:ida_free+0xd3/0x130
> [ 18.631482] Code: 62 ff 31 f6 48 89 e7 e8 bb 1b 02 00 eb 5a 83 fb 3e 76 36 48 8b 3c 24 e8 ab 74 03 00 89 ee 48 c7 c7 70 d6 bd b4 e8 7d 1e 36 ff <0f> 0b 48 8b 44 24 38 65 48 2b 04 25 28 00 00 00 75 37 48 83 c4 40
> [ 18.631484] RSP: 0018:ff59485680267d58 EFLAGS: 00010282
> [ 18.631485] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffb53064c8
> [ 18.631486] RDX: 0000000000020940 RSI: 0000000000000000 RDI: ffffffffb53365d0
> [ 18.631487] RBP: 0000000000000000 R08: 0000000000000000 R09: ff59485680267b40
> [ 18.631487] R10: ff59485680267b38 R11: ffffffffb5336508 R12: 0000000000000000
> [ 18.631488] R13: ff2c9dd3800730c8 R14: 0000000000000000 R15: ff2c9dd38385d800
> [ 18.631489] FS: 0000000000000000(0000) GS:ff2c9dd3fdc00000(0000) knlGS:0000000000000000
> [ 18.631490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 18.631491] CR2: 000055e2e7678098 CR3: 0000002003450005 CR4: 0000000000771ef0
> [ 18.631492] PKRU: 55555554
> [ 18.631492] Call Trace:
> [ 18.631494] <TASK>
> [ 18.631495] idxd_pci_probe+0x1b0/0x1860 [idxd]
> [ 18.631502] ? set_next_entity+0xcb/0x1b0
> [ 18.631506] local_pci_probe+0x43/0xa0
> [ 18.631508] work_for_cpu_fn+0x13/0x20
> [ 18.631510] process_one_work+0x179/0x390
> [ 18.631512] worker_thread+0x237/0x340
> [ 18.631515] ? __pfx_worker_thread+0x10/0x10
> [ 18.631517] kthread+0xc6/0x100
> [ 18.631519] ? __pfx_kthread+0x10/0x10
> [ 18.631520] ret_from_fork+0x2d/0x50
> [ 18.631523] ? __pfx_kthread+0x10/0x10
> [ 18.631524] ret_from_fork_asm+0x1a/0x30
> [ 18.631526] </TASK>
> [ 18.631527] ---[ end trace 0000000000000000 ]---
>
> When an IDXD device probe fails (e.g., device is HALTED), the error
> path in idxd_pci_probe() calls idxd_free() which performs:
>
> 1. put_device(idxd_confdev(idxd))
> 2. bitmap_free(idxd->opcap_bmap)
> 3. ida_free(&idxd_ida, idxd->id)
> 4. kfree(idxd)
>
> However, since device_initialize() was already called in idxd_alloc(),
> the conf_dev has a refcount of 1. The put_device() in step 1 drops
> this to 0 and synchronously invokes idxd_conf_device_release() via:
>
> put_device() -> kobject_put() -> kobject_release() -> kobject_cleanup()
> -> device_release() -> dev->type->release -> idxd_conf_device_release()
>
> idxd_conf_device_release() already performs:
>
> ida_free(&idxd_ida, idxd->id);
> bitmap_free(idxd->opcap_bmap);
> kfree(idxd);
>
> Therefore steps 2-4 in idxd_free() operate on already-freed memory:
> - step 2: bitmap_free on dangling pointer (use-after-free)
> - step 3: ida_free on already-released ID, triggering:
> "ida_free called for id=0 which is not allocated"
> - step 4: double kfree() corrupts slab freelist metadata
>
> This is consistent with the pattern established in commit
> c311f5e9248471a950 ("dmaengine: idxd: Fix freeing the allocated ida
> too late") where ida_free() was removed from the cdev .release()
> callback because resources must not be freed in both the .release()
> callback and the caller of put_device().
This is basically a simple double-free problem. Can you summarize it in the
commit message, keeping it short and leaving only the key points?
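For reference, the ownership rule at issue can be modelled in plain userspace
C. This is only a minimal sketch, not the driver code: the model_* names, the
fixed-size id table, and the bare refcount field are hypothetical stand-ins
for put_device(), idxd_ida, and the kobject refcount. It shows that once the
release callback frees everything, the caller must not touch the object after
the final put.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of this is the kernel API. */
struct model_dev {
	int refcount;
	void (*release)(struct model_dev *dev);
};

struct model_idxd {
	struct model_dev conf_dev;	/* first member, so the cast below is valid */
	unsigned long *opcap_bmap;
	int id;
};

#define MODEL_MAX_IDS 8

static bool id_in_use[MODEL_MAX_IDS];	/* stands in for the ida */
static int release_calls;		/* counts release invocations */

/* Models the .release() callback: it owns every resource. */
static void model_release(struct model_dev *dev)
{
	struct model_idxd *idxd = (struct model_idxd *)dev;

	id_in_use[idxd->id] = false;
	free(idxd->opcap_bmap);
	free(idxd);
	release_calls++;
}

/* Models put_device(): the last put runs release synchronously. */
static void model_put_device(struct model_dev *dev)
{
	if (--dev->refcount == 0)
		dev->release(dev);
}

static struct model_idxd *model_alloc(int id)
{
	struct model_idxd *idxd;

	assert(id >= 0 && id < MODEL_MAX_IDS);

	idxd = calloc(1, sizeof(*idxd));
	if (!idxd)
		return NULL;

	idxd->opcap_bmap = calloc(4, sizeof(unsigned long));
	if (!idxd->opcap_bmap) {
		free(idxd);
		return NULL;
	}

	idxd->id = id;
	id_in_use[id] = true;
	idxd->conf_dev.refcount = 1;	/* models device_initialize() */
	idxd->conf_dev.release = model_release;
	return idxd;
}

/* Error-path teardown: drop the reference and do nothing else. */
static void model_free(struct model_idxd *idxd)
{
	if (!idxd)
		return;

	model_put_device(&idxd->conf_dev);
	/* idxd is gone here; any further free would be a double free. */
}
```

Calling model_alloc(0) followed by model_free() runs the release callback
exactly once and returns id 0 to the table. Adding a free() after the put in
model_free() would reproduce the double free described above.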
>
> The path is extremely rare in normal operation because:
> 1. IDXD probe only fails when the device is in HALTED state
> 2. The device enters HALTED state exclusively after reset_devices
> (kdump boot parameter) or unrecoverable hardware error
> 3. On a normally running system, IDXD probe always succeeds
This is not relevant to this patch. Regardless of how rare it is, this is a
code logic problem, and you can always inject an error to trigger it.
>
> Fixes: 90022b3a6981 ("dmaengine: idxd: fix memory leak in error handling path of idxd_pci_probe")
> Fixes: 46a5cca76c76 ("dmaengine: idxd: fix memory leak in error handling path of idxd_alloc")
> Cc: stable@xxxxxxxxxxxxxxx
> Cc: Shuai Xue <xueshuai@xxxxxxxxxxxxxxxxx>
> Cc: Dave Jiang <dave.jiang@xxxxxxxxx>
> Cc: Vinicius Costa Gomes <vinicius.gomes@xxxxxxxxx>
> Cc: Vinod Koul <vkoul@xxxxxxxxxx>
> Cc: Yi Sun <yi.sun@xxxxxxxxx>
> Cc: Fenghua Yu <fenghuay@xxxxxxxxxx>
> Cc: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
Except for Cc: stable, the other Cc: tags should be placed after the --- line.
> Signed-off-by: Bogdan Codres <bogdan.codres@xxxxxxxxxxxxx>
> ---
> drivers/dma/idxd/init.c | 16 +++++++++++-----
> 1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> index e55136bb5..b76f0d12b 100644
> --- a/drivers/dma/idxd/init.c
> +++ b/drivers/dma/idxd/init.c
> @@ -586,15 +586,18 @@ static void idxd_read_caps(struct idxd_device *idxd)
> idxd->hw.iaa_cap.bits = ioread64(idxd->reg_base + IDXD_IAACAP_OFFSET);
> }
>
> +/*
> + * Release an idxd device that was allocated (device_initialize() was called)
> + * but never successfully registered. put_device() drops the last reference and
> + * triggers idxd_conf_device_release() which frees all resources including the
> + * ida, opcap_bmap, and the idxd structure itself.
> + */
> static void idxd_free(struct idxd_device *idxd)
> {
> if (!idxd)
> return;
>
> put_device(idxd_confdev(idxd));
> - bitmap_free(idxd->opcap_bmap);
> - ida_free(&idxd_ida, idxd->id);
> - kfree(idxd);
> }
>
> static struct idxd_device *idxd_alloc(struct pci_dev *pdev, struct idxd_driver_data *data)
> @@ -634,13 +637,16 @@ static struct idxd_device *idxd_alloc(struct pci_dev *pdev, struct idxd_driver_d
> return idxd;
>
> err_name:
> + /* device_initialize() was called, so put_device() will trigger
> + * idxd_conf_device_release() which frees ida, opcap_bmap, and idxd.
> + * Do not fall through to err_opcap/err_ida.
> + */
> put_device(conf_dev);
> - bitmap_free(idxd->opcap_bmap);
> + return NULL;
> err_opcap:
> ida_free(&idxd_ida, idxd->id);
> err_ida:
> kfree(idxd);
> -
Removing this blank line is an unnecessary change.
Frank
> return NULL;
> }
>
> --
> 2.43.0
>