Re: [PATCH v3] net: dsa: mv88e6xxx: propperly shutdown PPU re-enable timer on destroy
From: David Oberhollenzer
Date: Tue Apr 01 2025 - 04:13:21 EST

Hi,

I did some further re-testing of the fix, looking at the similar race
in remove() as well as the earlier question about the locking and
cancellation order. V3 already expands on this, and the point still stands:
the nested timer+queue+trylock mechanism is somewhat tricky, and I still
manage to hit the race window when using just cancel_work_sync(), whether
without the lock or with a different teardown order.

On 1/15/25 12:27 AM, Jakub Kicinski wrote:
> On Mon, 13 Jan 2025 09:49:12 +0100 David Oberhollenzer wrote:
> > @@ -7323,6 +7323,8 @@ static int mv88e6xxx_probe(struct mdio_device *mdiodev)
> >  		mv88e6xxx_g1_irq_free(chip);
> >  	else
> >  		mv88e6xxx_irq_poll_free(chip);
> > +out_phy:
> > +	mv88e6xxx_phy_destroy(chip);
> >  out:
> >  	if (pdata)
> >  		dev_put(pdata->netdev);
>
> If this is the right ordering the order in mv88e6xxx_remove()
> looks suspicious. We call mv88e6xxx_phy_destroy() pretty early
> and then unregister from DSA. Isn't there a window where DSA
> callbacks can reschedule the timer?

Yes, this does look suspicious: mv88e6xxx_phy_destroy() should be called
after the switch is unregistered, otherwise it should logically cause
the same issue.

However, I did not manage to trigger this during testing, and changing the
order also did not fix the original issue I saw. I will nevertheless fix
the order in a follow-up v4 patch.
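
To illustrate the ordering concern in remove(), here is a toy userspace
model. It is not driver code: every name in it (toy_chip, toy_callback,
toy_remove, ...) is hypothetical, the "timer" is just a flag, and the
callback is placed by hand at the point where the window would be. It only
shows why destroying the PHY state before unregistering could, in principle,
leave the timer armed again.

```c
#include <stdbool.h>

/* Toy model only; none of these names exist in the mv88e6xxx driver. */
struct toy_chip {
	bool registered;	/* stand-in for "DSA callbacks can still run" */
	bool timer_armed;	/* stand-in for the PPU re-enable timer */
};

/* Stand-in for a callback that accesses the PHY and re-arms the timer. */
static void toy_callback(struct toy_chip *c)
{
	if (c->registered)
		c->timer_armed = true;
}

/* Stand-in for the destroy step: stops the timer. */
static void toy_phy_destroy(struct toy_chip *c)
{
	c->timer_armed = false;
}

/* Stand-in for unregistering the switch: no more callbacks afterwards. */
static void toy_unregister(struct toy_chip *c)
{
	c->registered = false;
}

/* Returns true if the timer is still armed once teardown has finished. */
static bool toy_remove(bool destroy_first)
{
	struct toy_chip c = { .registered = true, .timer_armed = true };

	if (destroy_first) {
		toy_phy_destroy(&c);
		toy_callback(&c);	/* lands in the window, re-arms the timer */
		toy_unregister(&c);
	} else {
		toy_unregister(&c);
		toy_callback(&c);	/* too late, switch is already unregistered */
		toy_phy_destroy(&c);
	}

	return c.timer_armed;
}
```

In this model toy_remove(true) leaves the timer armed and toy_remove(false)
does not. Whether a real callback can actually hit that window in the driver
is a separate question that the model does not answer.
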
Greetings,
David