Re: [PATCH v2 2/2] net: fman: use devm_kzalloc() for fman and rely on devres
From: 赵金明
Date: Tue Jun 23 2026 - 02:18:39 EST
Hi Andrew,
Thank you for pointing me to the netdev maintainer documentation. I have
read section 1.7.4 and I understand the concern about standalone
cleanup conversions.
I would like to clarify the actual motivation behind the
devm_kzalloc() change. While it may appear to be a simple devm_
conversion on the surface, it is in fact fixing a use-after-free race
condition in the IRQF_SHARED error paths. Let me explain the problem
in detail.
PROBLEM
=======
In read_dts_node(), the driver allocates fman with kzalloc_obj() and
then registers shared interrupt handlers via devm_request_irq(), with
fman passed as the dev_id:
L2700: fman = kzalloc_obj(*fman);
...
L2774: devm_request_irq(&of_dev->dev, irq, fman_irq,
IRQF_SHARED, "fman", fman);
L2783: devm_request_irq(&of_dev->dev, err_irq, fman_err_irq,
IRQF_SHARED, "fman-err", fman);
The devres list after successful registration of these resources is:
[fman (manual)] -> [main_irq handler] -> [err_irq handler] -> [ioremap]
There are three error paths after main_irq has been registered that
hit the fman_free label and call kfree(fman):
- devm_request_irq(err_irq) fails (L2779) -- main_irq still registered
- devm_platform_get_and_ioremap_resource() fails (L2797) -- both IRQs registered
- of_platform_populate() fails (L2809) -- both IRQs registered
Note that on the err_irq failure path (L2779), err_irq was not
successfully registered, so only main_irq remains in the devres list.
On the other two paths, both IRQ handlers are registered.
Taking of_platform_populate() failure as an example, the full error
cleanup sequence is:
1. of_platform_populate() fails
2. goto fman_free
3. kfree(fman) -- fman is freed, but dev_id
still points to this memory
4. read_dts_node() returns ERR_PTR(err)
5. fman_probe() returns error code
6. Driver core calls device_unbind_cleanup() -> devres_release_all(),
which releases devres resources in LIFO order:
- Step 6a: free ioremap
- Step 6b: devm_free_irq(err_irq) -- err_irq handler unregistered
- Step 6c: devm_free_irq(main_irq) -- main_irq handler unregistered
Until step 6c completes, at least one IRQ handler remains registered
with the interrupt subsystem. During the window between step 3
(kfree) and step 6c, a shared IRQ may fire and the handler will
dereference the already-freed fman.
Because the handlers are registered with IRQF_SHARED, the kernel will
call fman_irq() and fman_err_irq() even when the interrupt is triggered
by another device sharing the same IRQ line.
Both handlers immediately dereference the fman pointer:
static irqreturn_t fman_irq(int irq, void *handle)
{
struct fman *fman = (struct fman *)handle;
if (!is_init_done(fman->cfg)) -- dereferences freed fman
return IRQ_NONE;
...
}
The same issue exists in fman_config(). Its err_fm_state error path
calls kfree(fman) while the devm IRQ handlers registered earlier in
read_dts_node() are still active:
err_fm_state:
kfree(fman); -- free fman
return -EINVAL; -- devres cleanup runs after return
When fman_config() fails, fman_probe() returns -EINVAL at L2841 without
any cleanup, and the driver core then calls devres_release_all() which
releases the IRQ handlers -- again after fman has already been freed.
HOW devm_kzalloc() FIXES IT
============================
By allocating fman with devm_kzalloc(), it becomes the first entry in
the devres list:
devres list: [fman] -> [main_irq] -> [err_irq] -> [ioremap]
Devres releases resources in LIFO order:
1. free ioremap
2. devm_free_irq(err_irq) -- handlers unregistered
3. devm_free_irq(main_irq) -- handlers unregistered
4. devm_kfree(fman) -- fman freed last, no UAF
Since fman is freed only after all IRQ handlers have been unregistered,
the use-after-free window is completely eliminated.
The manual kfree(fman) calls must also be removed to avoid double-free,
which is why they are dropped in this patch along with the allocation
change.
I will rework the patch with a commit message that clearly describes
this as a UAF fix rather than a cleanup conversion. Please let me know
if this explanation addresses the concern.
Best regards,
ZhaoJinming
>On Mon, Jun 22, 2026 at 05:05:05PM +0800, ZhaoJinming wrote:
>> The driver now allocates the top-level struct fman with devm_kzalloc()
>> so that its lifetime is bound to the device and resources are released
>> automatically by the driver core on probe failure or device removal.
>>
>> Remove the explicit kfree(fman) from the error paths in fman_config()
>> and read_dts_node() to avoid double-free/use-after-free and to follow
>> the devm_ allocation convention.
>>
>> After of_find_matching_node() consumes fm_node's reference via
>> of_node_put(from), the post-muram error paths no longer need to clean
>> up fm_node, so replace goto fman_free with direct return ERR_PTR(err).
>>
>> This change complements the existing use of devm_* resources (irq,
>> ioremap, etc.) and simplifies the error handling paths.
>>
>> Signed-off-by: ZhaoJinming <zhaojinming@xxxxxxxxxxxxx>
>
>Please take a read of:
>
>https://www.kernel.org/doc/html/latest/process/maintainer-netdev.html
>
>Please read it all, but see section 1.7.4.
>
>??? Andrew
>
>---
>pw-bot: cr
>