Re: [PATCH] Sanitize possible_parent_show to Handle Return Value of of_clk_get_parent_name

From: Alessandro Carminati
Date: Thu Sep 14 2023 - 09:25:44 EST


Alessandro Carminati
Mobile: +39.339.8870183


Il giorno mar 12 set 2023 alle ore 19:59 Stephen Boyd
<sboyd@xxxxxxxxxx> ha scritto:
>
> Quoting Alessandro Carminati (2023-09-12 10:05:19)
> > Il giorno ven 8 set 2023 alle ore 23:25 Stephen Boyd
> > <sboyd@xxxxxxxxxx> ha scritto:
> > >
> > > Is nothing actually wrong? And this is a speculative patch?
> >
> > In the current state, Linux can crash, so I would say that there's something
> > wrong.
> > I submitted this patch in response to a specific bug report that caused a
> > kernel crash during testing.
>
> Where is the bug report? On some public tracker? Can you link to it?
Sorry for the wait, I used this time to verify internally if the test can
be shared.
Unfortunately, we can't share it on a public tracker or the test outside
our company.

But I can definitely provide a simple reproducer.
It's a breeze once you've got the right hardware, like the Qualcomm
SA8540P-based board where I found it.
Here it is:

# cat /sys/kernel/debug/clk/gcc_usb4_phy_sys_clk_src/clk_possible_parents
[ 139.322764] Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000000
[ 139.331802] Mem abort info:
[ 139.334679] ESR = 0x0000000096000004
[ 139.338536] EC = 0x25: DABT (current EL), IL = 32 bits
[ 139.343991] SET = 0, FnV = 0
[ 139.347131] EA = 0, S1PTW = 0
[ 139.350356] FSC = 0x04: level 0 translation fault
[ 139.355360] Data abort info:
[ 139.358326] ISV = 0, ISS = 0x00000004
[ 139.362270] CM = 0, WnR = 0
[ 139.365322] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000109abb000
[ 139.371931] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[ 139.378909] Internal error: Oops: 0000000096000004 [#1] PREEMPT_RT SMP
[ 139.378915] Modules linked in: rfkill dwc3 pinctrl_spmi_gpio
rtc_pm8xxx qcom_spmi_pmic marvell regmap_spmi udc_core
dwmac_qcom_ethqos socinfo stmmac_platform stmmac qrtr crct10dif_ce
ghash_ce sha2_ce qcom_q6v5_pas sha256_arm64 sha1_ce qcom_pil_info
spmi_pmic_arb mdt_loader qcom_common spmi qcom_glink_smem
phy_qcom_qmp_usb dwc3_qcom qcom_glink llcc_qcom pcs_xpcs qcom_smd
phy_qcom_qmp_combo phylink qcom_q6v5 phy_qcom_snps_femto_v2
qcom_sysmon rpmsg_core qcom_hwspinlock qmi_helpers qcom_cpufreq_hw
qcom_aoss qcom_wdt qcom_rng smp2p smem sg fuse ext4 mbcache jbd2
ufs_qcom ufshcd_pltfrm ufshcd_core qcom_geni_serial rpmhpd
qcom_geni_se qcom_rpmh_regulator pinctrl_sc8280xp pinctrl_sa8775p
pinctrl_msm phy_qcom_qmp_ufs qnoc_sc8280xp qnoc_sa8775p icc_rpmh
icc_bcm_voter governor_simpleondemand gcc_sc8280xp gcc_sa8775p
clk_rpmh qcom_rpmh cmd_db clk_qcom
[ 139.379012] CPU: 7 PID: 623 Comm: cat Not tainted 5.14.0+ #149
[ 139.379017] Hardware name: Qualcomm SA8540P Ride (DT)
[ 139.379019] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 139.379024] pc : __pi_strlen+0x14/0x150
[ 139.379037] lr : seq_puts+0x28/0x70
[ 139.379047] sp : ffff800009bbba90
[ 139.379048] x29: ffff800009bbba90 x28: ffff10604726fdd0 x27: 0000000000400cc0
[ 139.379054] x26: 000000007ffff000 x25: ffff10604726fdc0 x24: 0000000000000000
[ 139.379058] x23: ffff800009bbbbd0 x22: 0000000000000020 x21: ffff10604076a500
[ 139.379063] x20: 0000000000000000 x19: ffff10604726fd98 x18: 0000000000000000
[ 139.379067] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 139.379071] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 139.379076] x11: 0000000000000000 x10: ffffa59b267f6f1b x9 : ffffa19b2397332c
[ 139.379080] x8 : 0101010101010101 x7 : 00000000736c6c65 x6 : 0080e3ecef636b2d
[ 139.379085] x5 : 2d6b636f6c630000 x4 : 0000000000000000 x3 : 0000000000000000
[ 139.379090] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
[ 139.379094] Call trace:
[ 139.379095] __pi_strlen+0x14/0x150
[ 139.379101] possible_parent_show+0x7c/0x100
[ 139.379108] possible_parents_show+0x4c/0x90
[ 139.379114] seq_read_iter+0xdc/0x480
[ 139.379120] seq_read+0xc0/0x100
[ 139.379127] full_proxy_read+0x64/0xb0
[ 139.379135] vfs_read+0xb4/0x1d0
[ 139.379142] ksys_read+0x70/0x100
[ 139.379148] __arm64_sys_read+0x20/0x30
[ 139.379154] invoke_syscall.constprop.0+0x7c/0xd0
[ 139.379164] el0_svc_common.constprop.0+0x124/0x134
[ 139.379170] do_el0_svc+0x2c/0xac
[ 139.379176] el0_svc+0x38/0x1c0
[ 139.379184] el0t_64_sync_handler+0x10c/0x11c
[ 139.379187] el0t_64_sync+0x17c/0x180
[ 139.379192] Code: 92402c04 b200c3e8 f13fc09f 5400088c (a9400c02)
[ 139.631968] ---[ end trace f38c177abbfb4315 ]---
[ 139.631971] Kernel panic - not syncing: Oops: Fatal exception
[ 139.642601] SMP: stopping secondary CPUs
[ 139.643039] Kernel Offset: 0x219b1b280000 from 0xffff800008000000
[ 139.643040] PHYS_OFFSET: 0xffffefa0c0000000
[ 139.643040] CPU features: 0x0000000,05c01018,c801720b
[ 139.643042] Memory Limit: none

>
> >
> > >
> > > All other arms of this conditional statement check the validity of the
> > > pointer before printing the string. And when the parent isn't known we
> > > print "(missing)", so it looks like we should do that instead. How about
> > > this patch?
> >
> > Indeed, your patch appears to provide a more verbose coding and accurate
> > message, which is valuable for troubleshooting.
> > So yeah, for what it is worth, I think it is good as well.
>
> Cool, can you send it after testing it out?
I will thank you for your assistance.