Re: [PATCH] pmdomain: arm: Fix NULL dereference on scmi_perf_domain removal

From: Ulf Hansson
Date: Tue Jan 30 2024 - 08:13:30 EST


On Thu, 25 Jan 2024 at 20:18, Cristian Marussi <cristian.marussi@xxxxxxx> wrote:
>
> On unloading of the scmi_perf_domain module got the below splat, when in
> the DT provided to the system under test the '#power-domain-cells' property
> was missing.
> Indeed, this particular setup causes the probe to bail out early without
> giving any error, so that, then, the removal code is run on unload, but
> without all the expected initialized structures in place.
>
> Add a check and bail out early on remove too.

Thanks for spotting this!

>
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
> Mem abort info:
> ESR = 0x0000000096000004
> EC = 0x25: DABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
> FSC = 0x04: level 0 translation fault
> Data abort info:
> ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
> CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> user pgtable: 4k pages, 48-bit VAs, pgdp=00000001076e5000
> [0000000000000008] pgd=0000000000000000, p4d=0000000000000000
> Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
> Modules linked in: scmi_perf_domain(-) scmi_module scmi_core
> CPU: 0 PID: 231 Comm: rmmod Not tainted 6.7.0-00084-gb4b1f27d3b83-dirty #15
> Hardware name: linux,dummy-virt (DT)
> pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> pc : scmi_perf_domain_remove+0x28/0x70 [scmi_perf_domain]
> lr : scmi_perf_domain_remove+0x28/0x70 [scmi_perf_domain]
> sp : ffff80008393bc10
> x29: ffff80008393bc10 x28: ffff0000875a8000 x27: 0000000000000000
> x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
> x23: ffff00008030c090 x22: ffff00008032d490 x21: ffff80007b287050
> x20: 0000000000000000 x19: ffff00008032d410 x18: 0000000000000000
> x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> x14: 8ba0696d05013a2f x13: 0000000000000000 x12: 0000000000000002
> x11: 0101010101010101 x10: ffff00008510cff8 x9 : ffff800080a6797c
> x8 : 0101010101010101 x7 : 7f7f7f7f7f7f7f7f x6 : fefefeff6364626d
> x5 : 8080808000000000 x4 : 0000000000000020 x3 : 00000000553a3dc1
> x2 : ffff0000875a8000 x1 : ffff0000875a8000 x0 : ffff800082ffa048
> Call trace:
> scmi_perf_domain_remove+0x28/0x70 [scmi_perf_domain]
> scmi_dev_remove+0x28/0x40 [scmi_core]
> device_remove+0x54/0x90
> device_release_driver_internal+0x1dc/0x240
> driver_detach+0x58/0xa8
> bus_remove_driver+0x78/0x108
> driver_unregister+0x38/0x70
> scmi_driver_unregister+0x28/0x180 [scmi_core]
> scmi_perf_domain_driver_exit+0x18/0xb78 [scmi_perf_domain]
> __arm64_sys_delete_module+0x1a8/0x2c0
> invoke_syscall+0x50/0x128
> el0_svc_common.constprop.0+0x48/0xf0
> do_el0_svc+0x24/0x38
> el0_svc+0x34/0xb8
> el0t_64_sync_handler+0x100/0x130
> el0t_64_sync+0x190/0x198
> Code: a90153f3 f9403c14 f9414800 955f8a05 (b9400a80)
> ---[ end trace 0000000000000000 ]---
>
> Cc: Sudeep Holla <sudeep.holla@xxxxxxx>
> Cc: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> Fixes: 2af23ceb8624 ("pmdomain: arm: Add the SCMI performance domain")
> Signed-off-by: Cristian Marussi <cristian.marussi@xxxxxxx>
> ---
> I suppose the probe does NOT bail out with an error because this DT config has
> to be supported, right ?

Actually, no. It's a mistake by me, the probe should bail out with an
error code.

In fact, there is also one additional similar problem in probe, when
the number of perf-domains are zero. In that case, we should also
return an error code, rather than returning 0.

Would you mind updating the patch to cover both problems - or if you
are too busy, just let me know and I can help out.

> ---
> drivers/pmdomain/arm/scmi_perf_domain.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/pmdomain/arm/scmi_perf_domain.c b/drivers/pmdomain/arm/scmi_perf_domain.c
> index 709bbc448fad..d7ef46ccd9b8 100644
> --- a/drivers/pmdomain/arm/scmi_perf_domain.c
> +++ b/drivers/pmdomain/arm/scmi_perf_domain.c
> @@ -159,6 +159,9 @@ static void scmi_perf_domain_remove(struct scmi_device *sdev)
> struct genpd_onecell_data *scmi_pd_data = dev_get_drvdata(dev);
> int i;
>
> + if (!scmi_pd_data)
> + return;
> +
> of_genpd_del_provider(dev->of_node);
>
> for (i = 0; i < scmi_pd_data->num_domains; i++)
> --
> 2.43.0
>

Kind regards
Uffe