Re: [PATCH V2] remoteproc: core: Clear table_sz when rproc_shutdown
From: Peng Fan
Date: Thu Mar 27 2025 - 23:42:17 EST
On Thu, Mar 27, 2025 at 11:46:33AM -0600, Mathieu Poirier wrote:
>Hi,
>
>On Wed, Mar 26, 2025 at 10:02:14AM +0800, Peng Fan (OSS) wrote:
>> From: Peng Fan <peng.fan@xxxxxxx>
>>
>> There is case as below could trigger kernel dump:
>> Use U-Boot to start remote processor(rproc) with resource table
>> published to a fixed address by rproc. After Kernel boots up,
>> stop the rproc, load a new firmware which doesn't have resource table
>> ,and start rproc.
>>
>
>If a firwmare image doesn't have a resouce table, rproc_elf_load_rsc_table()
>will return an error [1], rproc_fw_boot() will exit prematurely [2] and the
>remote processor won't be started. What am I missing?
STM32 and i.MX use their own parse_fw implementation which allows no resource
table:
https://elixir.bootlin.com/linux/v6.13.7/source/drivers/remoteproc/stm32_rproc.c#L272
https://elixir.bootlin.com/linux/v6.13.7/source/drivers/remoteproc/imx_rproc.c#L598
Thanks,
Peng
>
>[1]. https://elixir.bootlin.com/linux/v6.14-rc6/source/drivers/remoteproc/remoteproc_elf_loader.c#L338
>[2]. https://elixir.bootlin.com/linux/v6.14-rc6/source/drivers/remoteproc/remoteproc_core.c#L1411
>
>> When starting rproc with a firmware not have resource table,
>> `memcpy(loaded_table, rproc->cached_table, rproc->table_sz)` will
>> trigger dump, because rproc->cache_table is set to NULL during the last
>> stop operation, but rproc->table_sz is still valid.
>>
>> This issue is found on i.MX8MP and i.MX9.
>>
>> Dump as below:
>> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
>> Mem abort info:
>> ESR = 0x0000000096000004
>> EC = 0x25: DABT (current EL), IL = 32 bits
>> SET = 0, FnV = 0
>> EA = 0, S1PTW = 0
>> FSC = 0x04: level 0 translation fault
>> Data abort info:
>> ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
>> CM = 0, WnR = 0, TnD = 0, TagAccess = 0
>> GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
>> user pgtable: 4k pages, 48-bit VAs, pgdp=000000010af63000
>> [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
>> Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
>> Modules linked in:
>> CPU: 2 UID: 0 PID: 1060 Comm: sh Not tainted 6.14.0-rc7-next-20250317-dirty #38
>> Hardware name: NXP i.MX8MPlus EVK board (DT)
>> pstate: a0000005 (NzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>> pc : __pi_memcpy_generic+0x110/0x22c
>> lr : rproc_start+0x88/0x1e0
>> Call trace:
>> __pi_memcpy_generic+0x110/0x22c (P)
>> rproc_boot+0x198/0x57c
>> state_store+0x40/0x104
>> dev_attr_store+0x18/0x2c
>> sysfs_kf_write+0x7c/0x94
>> kernfs_fop_write_iter+0x120/0x1cc
>> vfs_write+0x240/0x378
>> ksys_write+0x70/0x108
>> __arm64_sys_write+0x1c/0x28
>> invoke_syscall+0x48/0x10c
>> el0_svc_common.constprop.0+0xc0/0xe0
>> do_el0_svc+0x1c/0x28
>> el0_svc+0x30/0xcc
>> el0t_64_sync_handler+0x10c/0x138
>> el0t_64_sync+0x198/0x19c
>>
>> Clear rproc->table_sz to address the issue.
>>
>> While at here, also clear rproc->table_sz when rproc_fw_boot and
>> rproc_detach.
>>
>> Fixes: 9dc9507f1880 ("remoteproc: Properly deal with the resource table when detaching")
>> Signed-off-by: Peng Fan <peng.fan@xxxxxxx>
>> ---
>>
>> V2:
>> Clear table_sz when rproc_fw_boot and rproc_detach per Arnaud
>>
>> drivers/remoteproc/remoteproc_core.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
>> index c2cf0d277729..1efa53d4e0c3 100644
>> --- a/drivers/remoteproc/remoteproc_core.c
>> +++ b/drivers/remoteproc/remoteproc_core.c
>> @@ -1442,6 +1442,7 @@ static int rproc_fw_boot(struct rproc *rproc, const struct firmware *fw)
>> kfree(rproc->cached_table);
>> rproc->cached_table = NULL;
>> rproc->table_ptr = NULL;
>> + rproc->table_sz = 0;
>> unprepare_rproc:
>> /* release HW resources if needed */
>> rproc_unprepare_device(rproc);
>> @@ -2025,6 +2026,7 @@ int rproc_shutdown(struct rproc *rproc)
>> kfree(rproc->cached_table);
>> rproc->cached_table = NULL;
>> rproc->table_ptr = NULL;
>> + rproc->table_sz = 0;
>> out:
>> mutex_unlock(&rproc->lock);
>> return ret;
>> @@ -2091,6 +2093,7 @@ int rproc_detach(struct rproc *rproc)
>> kfree(rproc->cached_table);
>> rproc->cached_table = NULL;
>> rproc->table_ptr = NULL;
>> + rproc->table_sz = 0;
>> out:
>> mutex_unlock(&rproc->lock);
>> return ret;
>> --
>> 2.37.1
>>
>