Re: [PATCH iwl-net] ice: fix infinite recursion in ice_cfg_tx_topo via ice_init_dev_hw
From: Przemek Kitszel
Date: Thu Apr 16 2026 - 00:37:22 EST
On 4/15/26 23:22, Jacob Keller wrote:
On 4/15/2026 9:30 AM, Simon Horman wrote:
On Mon, Apr 13, 2026 at 09:14:20PM +0200, Petr Oros wrote:
On certain E810 configurations where firmware supports Tx scheduler
topology switching (tx_sched_topo_comp_mode_en), ice_cfg_tx_topo()
may need to apply a new 5-layer or 9-layer topology from the DDP
package. If the AQ command to set the topology fails (e.g. due to
invalid DDP data or firmware limitations), the global configuration
lock must still be cleared via a CORER reset.
Commit 86aae43f21cf ("ice: don't leave device non-functional if Tx
scheduler config fails") correctly fixed this by refactoring
ice_cfg_tx_topo() to always trigger CORER after acquiring the global
lock and re-initialize hardware via ice_init_hw() afterwards.
However, commit 8a37f9e2ff40 ("ice: move ice_deinit_dev() to the end
of deinit paths") later moved ice_init_dev_hw() into ice_init_hw(),
breaking the reinit path introduced by 86aae43f21cf. This creates an
infinite recursive call chain:
ice_init_hw()
ice_init_dev_hw()
ice_cfg_tx_topo() # topology change needed
ice_deinit_hw()
ice_init_hw() # reinit after CORER
ice_init_dev_hw() # recurse
ice_cfg_tx_topo()
... # stack overflow
Fix by moving ice_init_dev_hw() back out of ice_init_hw() and calling
it explicitly from ice_probe() and ice_devlink_reinit_up(). The third
caller, ice_cfg_tx_topo(), intentionally does not need ice_init_dev_hw()
ice_cfg_tx_topo() stops calling ice_init_dev_hw(), that is the real
change that patch does, OK
during its reinit, it only needs the core HW reinitialization. This
breaks the recursion cleanly without adding flags or guards.
The deinit ordering changes from commit 8a37f9e2ff40 ("ice: move
ice_deinit_dev() to the end of deinit paths") which fixed slow rmmod
are preserved, only the init-side placement of ice_init_dev_hw() is
reverted.
Fixes: 8a37f9e2ff40 ("ice: move ice_deinit_dev() to the end of deinit paths")
Signed-off-by: Petr Oros <poros@xxxxxxxxxx>
Hi Petr,
I don't intended to delay this patch.
But could you follow-up by looking over the AI generated
review of this patch on sashiko.dev?
Thanks!
I'll take a look as well. I recently included this fix in Intel Wired
LAN update last night, so hopefully nothing too problematic...
Sashiko says:
While this code wasn't introduced by this patch, the restructuring makes it
more visible: can this cause a use-after-free if the nested hardware
initialization fails?
If ice_cfg_tx_topo() triggers a topology change, it performs a CORER reset
followed by an unroll (ice_deinit_hw) and re-initialization (ice_init_hw). If
that nested ice_init_hw() fails, its unroll path frees hw->port_info and
destroys control queues and mutexes.
here is a talk about "prerequisite for the problem"
Because ice_init_dev_hw() returns void, it swallows the -ENODEV error and
and here is about code that Petr just removes, IOW, does not apply
Plausible sounding comments, yeah, I hope we will not drown in the sea
of AI content :(
for the patch:
I have tested that it does not break my test suite (it was me to start
touching ice_init_hw() and friends), and both code and human written
commit message looks good,
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@xxxxxxxxx>
thank you for fixing the code after me!