Re: [PATCH] remoteproc: core: Clear table_sz when rproc_shutdown
From: Arnaud POULIQUEN
Date: Wed Mar 19 2025 - 08:44:23 EST
Hello Peng,
On 3/19/25 11:01, Peng Fan (OSS) wrote:
> From: Peng Fan <peng.fan@xxxxxxx>
>
> The following case can trigger a kernel dump:
> Use U-Boot to start the remote processor (rproc) with a resource table
> published to a fixed address by rproc. After the kernel boots up,
> stop the rproc, load a new firmware which doesn't have a resource table,
> and start the rproc.
>
> When starting rproc with a firmware that does not have a resource table,
> `memcpy(loaded_table, rproc->cached_table, rproc->table_sz)` will
> trigger the dump, because rproc->cached_table is set to NULL during the
> last stop operation, but rproc->table_sz still holds the stale size.
>
> This issue was found on i.MX8MP and i.MX9.
>
> Dump as below:
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> Mem abort info:
> ESR = 0x0000000096000004
> EC = 0x25: DABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
> FSC = 0x04: level 0 translation fault
> Data abort info:
> ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
> CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> user pgtable: 4k pages, 48-bit VAs, pgdp=000000010af63000
> [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
> Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
> Modules linked in:
> CPU: 2 UID: 0 PID: 1060 Comm: sh Not tainted 6.14.0-rc7-next-20250317-dirty #38
> Hardware name: NXP i.MX8MPlus EVK board (DT)
> pstate: a0000005 (NzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : __pi_memcpy_generic+0x110/0x22c
> lr : rproc_start+0x88/0x1e0
> Call trace:
> __pi_memcpy_generic+0x110/0x22c (P)
> rproc_boot+0x198/0x57c
> state_store+0x40/0x104
> dev_attr_store+0x18/0x2c
> sysfs_kf_write+0x7c/0x94
> kernfs_fop_write_iter+0x120/0x1cc
> vfs_write+0x240/0x378
> ksys_write+0x70/0x108
> __arm64_sys_write+0x1c/0x28
> invoke_syscall+0x48/0x10c
> el0_svc_common.constprop.0+0xc0/0xe0
> do_el0_svc+0x1c/0x28
> el0_svc+0x30/0xcc
> el0t_64_sync_handler+0x10c/0x138
> el0t_64_sync+0x198/0x19c
>
> Clear rproc->table_sz to address the issue.
>
> Fixes: 9dc9507f1880 ("remoteproc: Properly deal with the resource table when detaching")
> Signed-off-by: Peng Fan <peng.fan@xxxxxxx>
> ---
>
> V1:
> Another possible fix is to clear rproc->table_sz in
> imx_rproc_parse_fw, but I think this issue should be common to others,
> so I made the change in rproc_shutdown. Since it is in rproc_shutdown,
> clearing table_sz should not introduce new issues.
>
> The kernel dump was found by Jacky Bai in NXP internal testing, so I did
> not add a tag on the public list here. Jacky, feel free to send a
> Reported-by to the community.
>
> drivers/remoteproc/remoteproc_core.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> index c2cf0d277729..b21eedefff87 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -2025,6 +2025,7 @@ int rproc_shutdown(struct rproc *rproc)
> kfree(rproc->cached_table);
> rproc->cached_table = NULL;
> rproc->table_ptr = NULL;
> + rproc->table_sz = 0;

Your fix makes sense from my point of view.
It seems that you should also apply this fix in rproc_detach() and rproc_fw_boot().

Regards,
Arnaud

> out:
> mutex_unlock(&rproc->lock);
> return ret;
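
For illustration only, here is a minimal userspace sketch of the stale-size problem described above. It is not kernel code: struct fake_rproc and the fake_* helpers are hypothetical stand-ins that only mirror the three fields touched in the hunk (cached_table, table_ptr, table_sz). It shows why clearing the pointer without clearing the size leaves a state in which a later memcpy(dst, cached_table, table_sz) would read through NULL.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the struct rproc fields discussed in this thread. */
struct fake_rproc {
	void *cached_table;
	void *table_ptr;
	size_t table_sz;
};

/* Simulate a previously cached resource table of sz bytes. Returns 0 on success. */
static int fake_cache_table(struct fake_rproc *r, size_t sz)
{
	r->cached_table = calloc(1, sz);
	if (!r->cached_table)
		return -1;
	r->table_ptr = r->cached_table;
	r->table_sz = sz;
	return 0;
}

/*
 * Mimic the table teardown shown in the hunk. clear_sz == 0 models the
 * code before the patch, clear_sz != 0 models it with the added line.
 */
static void fake_table_teardown(struct fake_rproc *r, int clear_sz)
{
	free(r->cached_table);
	r->cached_table = NULL;
	r->table_ptr = NULL;
	if (clear_sz)
		r->table_sz = 0;
}

/*
 * Return 1 if memcpy(dst, r->cached_table, r->table_sz) would read
 * through a NULL source with a non-zero length, 0 otherwise.
 */
static int fake_copy_would_fault(const struct fake_rproc *r)
{
	return r->cached_table == NULL && r->table_sz != 0;
}
```

With clear_sz == 0 the teardown leaves a NULL pointer paired with a non-zero size, so fake_copy_would_fault() returns 1; with clear_sz != 0 both fields are reset together and it returns 0. The same pairing argument is presumably why the other teardown paths mentioned in the reply are worth checking.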