Re: [PATCH v2 2/2] net: fman: use devm_kzalloc() for fman and rely on devres

From: 赵金明

Date: Tue Jun 23 2026 - 02:18:39 EST


Hi Andrew,

Thank you for pointing me to the netdev maintainer documentation. I have
read section 1.7.4 and I understand the concern about standalone
cleanup conversions.

I would like to clarify the actual motivation behind the
devm_kzalloc() change. While it may appear to be a simple devm_
conversion on the surface, it is in fact fixing a use-after-free race
condition in the IRQF_SHARED error paths. Let me explain the problem
in detail.

PROBLEM
=======

In read_dts_node(), the driver allocates fman with kzalloc_obj() and
then registers shared interrupt handlers via devm_request_irq(), with
fman passed as the dev_id:

L2700: fman = kzalloc_obj(*fman);
...
L2774: devm_request_irq(&of_dev->dev, irq, fman_irq,
IRQF_SHARED, "fman", fman);
L2783: devm_request_irq(&of_dev->dev, err_irq, fman_err_irq,
IRQF_SHARED, "fman-err", fman);

The devres list after successful registration of these resources is:

[fman (manual)] -> [main_irq handler] -> [err_irq handler] -> [ioremap]

There are three error paths after main_irq has been registered that
hit the fman_free label and call kfree(fman):

- devm_request_irq(err_irq) fails (L2779) -- main_irq still registered
- devm_platform_get_and_ioremap_resource() fails (L2797) -- both IRQs registered
- of_platform_populate() fails (L2809) -- both IRQs registered

Note that on the err_irq failure path (L2779), err_irq was not
successfully registered, so only main_irq remains in the devres list.
On the other two paths, both IRQ handlers are registered.

Taking of_platform_populate() failure as an example, the full error
cleanup sequence is:

1. of_platform_populate() fails
2. goto fman_free
3. kfree(fman) -- fman is freed, but dev_id
still points to this memory
4. read_dts_node() returns ERR_PTR(err)
5. fman_probe() returns error code
6. Driver core calls device_unbind_cleanup() -> devres_release_all(),
which releases devres resources in LIFO order:
- Step 6a: free ioremap
- Step 6b: devm_free_irq(err_irq) -- err_irq handler unregistered
- Step 6c: devm_free_irq(main_irq) -- main_irq handler unregistered

Until step 6c completes, at least one IRQ handler remains registered
with the interrupt subsystem. During the window between step 3
(kfree) and step 6c, a shared IRQ may fire and the handler will
dereference the already-freed fman.

Because the handlers are registered with IRQF_SHARED, the kernel will
call fman_irq() and fman_err_irq() even when the interrupt is triggered
by another device sharing the same IRQ line.

Both handlers immediately dereference the fman pointer:

static irqreturn_t fman_irq(int irq, void *handle)
{
struct fman *fman = (struct fman *)handle;
if (!is_init_done(fman->cfg)) -- dereferences freed fman
return IRQ_NONE;
...
}

The same issue exists in fman_config(). Its err_fm_state error path
calls kfree(fman) while the devm IRQ handlers registered earlier in
read_dts_node() are still active:

err_fm_state:
kfree(fman); -- free fman
return -EINVAL; -- devres cleanup runs after return

When fman_config() fails, fman_probe() returns -EINVAL at L2841 without
any cleanup, and the driver core then calls devres_release_all() which
releases the IRQ handlers -- again after fman has already been freed.

HOW devm_kzalloc() FIXES IT
============================

By allocating fman with devm_kzalloc(), it becomes the first entry in
the devres list:

devres list: [fman] -> [main_irq] -> [err_irq] -> [ioremap]

Devres releases resources in LIFO order:

1. free ioremap
2. devm_free_irq(err_irq) -- handlers unregistered
3. devm_free_irq(main_irq) -- handlers unregistered
4. devm_kfree(fman) -- fman freed last, no UAF

Since fman is freed only after all IRQ handlers have been unregistered,
the use-after-free window is completely eliminated.

The manual kfree(fman) calls must also be removed to avoid double-free,
which is why they are dropped in this patch along with the allocation
change.

I will rework the patch with a commit message that clearly describes
this as a UAF fix rather than a cleanup conversion. Please let me know
if this explanation addresses the concern.

Best regards,
ZhaoJinming




>On Mon, Jun 22, 2026 at 05:05:05PM +0800, ZhaoJinming wrote:



>> The driver now allocates the top-level struct fman with devm_kzalloc()



>> so that its lifetime is bound to the device and resources are released



>> automatically by the driver core on probe failure or device removal.



>>



>> Remove the explicit kfree(fman) from the error paths in fman_config()



>> and read_dts_node() to avoid double-free/use-after-free and to follow



>> the devm_ allocation convention.



>>



>> After of_find_matching_node() consumes fm_node's reference via



>> of_node_put(from), the post-muram error paths no longer need to clean



>> up fm_node, so replace goto fman_free with direct return ERR_PTR(err).



>>



>> This change complements the existing use of devm_* resources (irq,



>> ioremap, etc.) and simplifies the error handling paths.



>>



>> Signed-off-by: ZhaoJinming <zhaojinming@xxxxxxxxxxxxx>



>



>Please take a read of:



>



>https://www.kernel.org/doc/html/latest/process/maintainer-netdev.html



>



>Please read it all, but see section 1.7.4.



>



>??? Andrew



>



>---



>pw-bot: cr



>