Re: [PATCH net-next] net: phylink: guard link replay helpers against NULL phylink instance

From: Paolo Abeni

Date: Tue Feb 17 2026 - 03:22:57 EST


On 2/5/26 8:23 PM, Vladimir Oltean wrote:
> There is a crash when unbinding the sja1105 driver under special
> circumstances:
>
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
> Call trace:
> phylink_run_resolve_and_disable+0x10/0x90
> sja1105_static_config_reload+0xc0/0x410
> sja1105_vlan_filtering+0x100/0x140
> dsa_port_vlan_filtering+0x13c/0x368
> dsa_port_reset_vlan_filtering.isra.0+0xe8/0x198
> dsa_port_bridge_leave+0x130/0x248
> dsa_user_changeupper.part.0+0x74/0x158
> dsa_user_netdevice_event+0x50c/0xa50
> notifier_call_chain+0x78/0x148
> raw_notifier_call_chain+0x20/0x38
> call_netdevice_notifiers_info+0x58/0xa8
> __netdev_upper_dev_unlink+0xac/0x220
> netdev_upper_dev_unlink+0x38/0x70
> del_nbp+0x1a4/0x320
> br_del_if+0x3c/0xd8
> br_device_event+0xf8/0x2d8
> notifier_call_chain+0x78/0x148
> raw_notifier_call_chain+0x20/0x38
> call_netdevice_notifiers_info+0x58/0xa8
> unregister_netdevice_many_notify+0x314/0x848
> unregister_netdevice_queue+0xe8/0xf8
> dsa_user_destroy+0x50/0xa8
> dsa_port_teardown+0x80/0x98
> dsa_switch_teardown_ports+0x4c/0xb8
> dsa_switch_deinit+0x94/0xb8
> dsa_switch_put_tree+0x2c/0xc0
> dsa_unregister_switch+0x38/0x60
> sja1105_remove+0x24/0x40
> spi_remove+0x38/0x60
> device_remove+0x54/0x90
> device_release_driver_internal+0x1d4/0x230
> device_driver_detach+0x20/0x38
> unbind_store+0xbc/0xc8
> ---[ end trace 0000000000000000 ]---
>
> which requires an explanation.
>
> When a port offloads a bridge, the switch must be reset to change
> the VLAN awareness state (the SJA1105_VLAN_FILTERING reason for
> sja1105_static_config_reload()). When the port leaves a VLAN-aware
> bridge, it must also be reset for the same reason: it is returning
> to operation as a VLAN-unaware standalone port.
>
> sja1105_static_config_reload() triggers the phylink link replay helpers.
>
> Because sja1105 is a switch, it has multiple user ports. During unbind,
> ports are torn down one by one in dsa_switch_teardown_ports() ->
> dsa_port_teardown() -> dsa_user_destroy().
>
> The crash happens when the numerically first user port is not part of
> the VLAN-aware bridge, but any other user port is.
>
> Tearing down the first user port causes phylink_destroy() to be called
> on dp->pl, and this pointer to be set to NULL. Then, when the second
> user port is torn down, this was offloading a VLAN-aware bridge port, so
> indirectly it will trigger sja1105_static_config_reload().
>
> The latter function iterates using dsa_switch_for_each_available_port(),
> and unconditionally dereferences dp->pl, including for the
> aforementioned torn down previous port, and passes that to phylink.
> This is where the NULL pointer is coming from.
>
> There are multiple levels at which this could be avoided:
> - add an "if (dp->pl)" in sja1105_static_config_reload()
> - make the phylink replay helpers NULL-tolerant
> - mark ports as DSA_PORT_TYPE_UNUSED after dsa_port_phylink_destroy()
> has run, such that subsequent dsa_switch_for_each_available_port()
> iterations skip them
> - disconnect the entire switch at once from switchdev and
> NETDEV_CHANGEUPPER events while unbinding, not just port by port,
> likely using a "ds->unbinding = true" mechanism or similar
>
> however options 3 and 4 are quite heavy and might have side effects,
> option 1 is very unassuming and option 2 seems a more elegant variant
> of 1, given the fact that sja1105 is the only user of these phylink
> replay helpers. It allows to keep the driver simple and is the option
> I went with.
>
> Functionally speaking, transforming the replay helpers into no-ops for
> ports without a phylink instance is fine, because that only happens
> during driver removal (an operation which cannot be cancelled). The
> ports are not required to work.
>
> Fixes: 0b2edc531e0b ("net: dsa: sja1105: let phylink help with the replay of link callbacks")
> Signed-off-by: Vladimir Oltean <vladimir.oltean@xxxxxxx>

I think this patch could land on current net, but it would be nice an
ack from phylib SMEs.

Thanks,

Paolo