Re: [PATCH net-next] net: don't relock netdev when on qdisc_create replay
From: Stanislav Fomichev
Date: Wed Mar 19 2025 - 14:55:50 EST
On 03/18, Simon Horman wrote:
> On Thu, Mar 13, 2025 at 03:04:07AM -0700, Stanislav Fomichev wrote:
> > Eric reports that by the time we call netdev_lock_ops after
> > rtnl_unlock/rtnl_lock, the dev might point to an invalid device.
> > Don't relock the device after request_module and don't try
> > to unlock it in the caller (tc_modify_qdisc) in case of replay.
> >
> > Fixes: a0527ee2df3f ("net: hold netdev instance lock during qdisc ndo_setup_tc")
> > Reported-by: Eric Dumazet <edumazet@xxxxxxxxxx>
> > Link: https://lore.kernel.org/netdev/20250305163732.2766420-1-sdf@xxxxxxxxxxx/T/#me8dfd778ea4c4463acab55644e3f9836bc608771
> > Signed-off-by: Stanislav Fomichev <sdf@xxxxxxxxxxx>
> > ---
> > net/sched/sch_api.c | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
> > index abace7665cfe..f1ec6ec0cf05 100644
> > --- a/net/sched/sch_api.c
> > +++ b/net/sched/sch_api.c
> > @@ -1278,13 +1278,14 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
> > * tell the caller to replay the request. We
> > * indicate this using -EAGAIN.
> > * We replay the request because the device may
> > - * go away in the mean time.
> > + * go away in the mean time. Note that we also
> > + * don't relock the device because it might
> > + * be gone at this point.
> > */
> > netdev_unlock_ops(dev);
> > rtnl_unlock();
> > request_module(NET_SCH_ALIAS_PREFIX "%s", name);
> > rtnl_lock();
> > - netdev_lock_ops(dev);
> > ops = qdisc_lookup_ops(kind);
> > if (ops != NULL) {
>
> Hi Stan,
>
> I see that if this condition is met then the replay logic
> in the next hunk works as intended by this patch.
>
> But what if this condition is not met?
> It seems to me that qdisc_create(), and thus __tc_modify_qdisc()
> will return with an unlocked device, but the replay logic
> won't take effect in tc_modify_qdisc().
>
> Am I missing something?
Oh, yes, thanks for catching this. Let me think about how to handle the
-ENOENT case as well.
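
For anyone following along, the imbalance Simon describes can be shown
with a small userspace model. This is only a sketch, not kernel code:
toy_dev, toy_lock, toy_unlock, toy_qdisc_create and toy_modify_qdisc are
hypothetical names, and the instance lock is reduced to a depth counter.
It assumes the caller skips its unlock only when it sees -EAGAIN, as the
commit message describes, and that a failed lookup after request_module()
returns -ENOENT.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the netdev instance lock: just a depth counter. */
struct toy_dev {
	int lock_depth;
};

static void toy_lock(struct toy_dev *dev)
{
	dev->lock_depth++;
}

static void toy_unlock(struct toy_dev *dev)
{
	dev->lock_depth--;
}

/*
 * Models the module-load path of qdisc_create() with the patch applied:
 * the device lock is dropped before loading the module and is NOT
 * retaken afterwards. module_found models whether the second ops
 * lookup succeeds.
 */
static int toy_qdisc_create(struct toy_dev *dev, bool module_found)
{
	toy_unlock(dev);	/* models netdev_unlock_ops(dev) */
	/* request_module() would run here with no locks held */
	if (module_found)
		return -EAGAIN;	/* caller is expected to replay */
	return -ENOENT;		/* device is still unlocked here */
}

/*
 * Models the caller: takes the lock, calls the create path, and skips
 * its own unlock only when a replay (-EAGAIN) is requested.
 */
int toy_modify_qdisc(struct toy_dev *dev, bool module_found)
{
	int err;

	toy_lock(dev);
	err = toy_qdisc_create(dev, module_found);
	if (err == -EAGAIN)
		return err;	/* replay: device was left unlocked on purpose */
	toy_unlock(dev);	/* unbalanced if the callee already unlocked */
	return err;
}
```

In this model the -EAGAIN path leaves lock_depth balanced at 0, while
the -ENOENT path ends at -1, i.e. one unlock too many. Any fix has to
make both error paths agree on who owns the lock when qdisc_create()
returns.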
---
pw-bot: cr