Re: [Patch net-next v2 0/9] net: dsa: microchip: add support for phylink mac config and link up

From: Vladimir Oltean
Date: Tue Aug 30 2022 - 06:01:03 EST


Hello,

On Tue, Aug 30, 2022 at 08:15:59AM +0000, Arun.Ramadoss@xxxxxxxxxxxxx wrote:
> On Tue, 2022-08-30 at 08:55 +0200, Oleksij Rempel wrote:
> > Hi Arun,
> >
> > starting with this patch set I have following regression on ksz8873
> > switch. Can you please take a look at it:
> > 8<--- cut here ---
> > Unable to handle kernel NULL pointer dereference at virtual address 00000005
> > ksz8863-switch gpio-0:00: nonfatal error -34 setting MTU to 1500 on port 0
> > ...
> > Modules linked in:
> > CPU: 0 PID: 16 Comm: kworker/0:1 Not tainted 6.0.0-rc2-00436-
> > g3da285df1324 #74
> > Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> > Workqueue: events_power_efficient phylink_resolve
> > PC is at ksz_set_gbit+0x5c/0xa4
> > LR is at arch_atomic_cmpxchg_relaxed+0x1c/0x38
> > ....
> > Backtrace:
> > ksz_set_gbit from ksz_phylink_mac_link_up+0x15c/0x1c8
> > ksz_phylink_mac_link_up from dsa_port_phylink_mac_link_up+0x7c/0x80
> > dsa_port_phylink_mac_link_up from phylink_resolve+0x304/0x3d0
> > phylink_resolve from process_one_work+0x214/0x31c
> > process_one_work from worker_thread+0x254/0x2d4
> > worker_thread from kthread+0xfc/0x108
> > kthread from ret_from_fork+0x14/0x2c
> > ...
> > ksz8863-switch gpio-0:00 lan2 (uninitialized): PHY [dsa-0.0:01] driver [Micrel KSZ8851 Ethernet MAC or KSZ886X Switch] (irq=POLL)
> > ksz8863-switch gpio-0:00: nonfatal error -34 setting MTU to 1500 on port 1
> > device eth0 entered promiscuous mode
> > DSA: tree 0 setup
> > ---[ end trace 0000000000000000 ]---
>
> Hi Oleksij,
> Is this Bug related to fix in
> https://lore.kernel.org/lkml/20220829105810.577903823@xxxxxxxxxxxxxxxxxxx/
> .
> It is observed in ksz8794 switch. I think after applying this bug fix
> patch it should work. I don't have ksz8 series to test. I ran the
> regression only for ksz9 series switches.

I find it unlikely that the cited patch will fix a NULL pointer
dereference in ksz_set_gbit(). Rather, it looks like some pointer is
NULL, and we then dereference something located at offset 0x5 from it,
no?

My eyes are on this:

	const u8 *bitval = dev->info->xmii_ctrl1;

	data8 |= FIELD_PREP(P_GMII_1GBIT_M, bitval[P_GMII_NOT_1GBIT]);
	                                           ~~~~~~~~~~~~~~~~
	                                           this is coincidentally
	                                           also 5
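
To make the arithmetic explicit, here is a minimal userspace sketch (not
kernel code). The enum order and the element_address() helper are
assumptions made for illustration; the only thing this thread confirms
is that P_GMII_NOT_1GBIT evaluates to 5. The address is computed with
integer arithmetic, so nothing is actually dereferenced.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;

/* Index order assumed for illustration; only the value 5 for
 * P_GMII_NOT_1GBIT is confirmed by this thread.
 */
enum xmii_ctrl1_idx {
	P_RGMII_SEL,
	P_RMII_SEL,
	P_GMII_SEL,
	P_MII_SEL,
	P_GMII_1GBIT,
	P_GMII_NOT_1GBIT,
};

/* Hypothetical helper: address that a read of table[idx] would touch.
 * sizeof(*table) is not evaluated, so a NULL table is safe here.
 */
static uintptr_t element_address(const u8 *table, size_t idx)
{
	return (uintptr_t)table + idx * sizeof(*table);
}
```

With a NULL table and index P_GMII_NOT_1GBIT, element_address() yields
5, which would match the faulting virtual address 00000005 in the oops.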

See, looking at the struct ksz_chip_data[] array element for KSZ8873
that Oleksij mentions as broken, I do not see xmii_ctrl0 and xmii_ctrl1
being set to point to anything.

	[KSZ8830] = {
		.chip_id = KSZ8830_CHIP_ID,
		.dev_name = "KSZ8863/KSZ8873",
		.num_vlans = 16,
		.num_alus = 0,
		.num_statics = 8,
		.cpu_ports = 0x4,	/* can be configured as cpu port */
		.port_cnt = 3,
		.ops = &ksz8_dev_ops,
		.mib_names = ksz88xx_mib_names,
		.mib_cnt = ARRAY_SIZE(ksz88xx_mib_names),
		.reg_mib_cnt = MIB_COUNTER_NUM,
		.regs = ksz8863_regs,
		.masks = ksz8863_masks,
		.shifts = ksz8863_shifts,
		.supports_mii = {false, false, true},
		.supports_rmii = {false, false, true},
		.internal_phy = {true, true, false},
	},

Should we point them to ksz8795_xmii_ctrl0 and ksz8795_xmii_ctrl1? I don't know.
Could you find out what these should be set to?
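
Independent of which tables turn out to be correct, the failure mode can
be illustrated with a small userspace sketch. This is not a proposed
patch: struct chip_info is a hypothetical, trimmed-down stand-in for
struct ksz_chip_data, and xmii_ctrl1_lookup() is an invented helper that
refuses to index a table the chip entry never set.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;

/* Hypothetical stand-in for struct ksz_chip_data; only the two table
 * pointers discussed in this thread are modelled.
 */
struct chip_info {
	const u8 *xmii_ctrl0;
	const u8 *xmii_ctrl1;
};

/* Hypothetical helper: fetch one xmii_ctrl1 entry into *val.
 * Returns 0 on success, or -EINVAL when the chip entry has no table,
 * instead of reading from a small offset off NULL.
 */
static int xmii_ctrl1_lookup(const struct chip_info *info, size_t idx,
			     u8 *val)
{
	if (!info || !info->xmii_ctrl1)
		return -EINVAL;

	*val = info->xmii_ctrl1[idx];
	return 0;
}
```

An entry that leaves .xmii_ctrl1 unset, like the one quoted above, gets
-EINVAL from this helper rather than an oops. Whether the KSZ8863/KSZ8873
entry should get its own tables, or should never reach this code path at
all, is still the open question.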