RE: [EXT] Re: [PATCH net-next v5 1/8] octeon_ep_vf: Add driver framework and device initialization

From: Shinas Rasheed
Date: Sat Feb 03 2024 - 00:35:57 EST


Hi Jakub,

> -----Original Message-----
> From: Jakub Kicinski <kuba@xxxxxxxxxx>
> Sent: Thursday, February 1, 2024 5:44 AM
> To: Shinas Rasheed <srasheed@xxxxxxxxxxx>
> Cc: netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Haseeb Gani
> <hgani@xxxxxxxxxxx>; Vimlesh Kumar <vimleshk@xxxxxxxxxxx>; Sathesh B
> Edara <sedara@xxxxxxxxxxx>; egallen@xxxxxxxxxx; mschmidt@xxxxxxxxxx;
> pabeni@xxxxxxxxxx; horms@xxxxxxxxxx; wizhao@xxxxxxxxxx;
> kheib@xxxxxxxxxx; konguyen@xxxxxxxxxx; David S. Miller
> <davem@xxxxxxxxxxxxx>; Eric Dumazet <edumazet@xxxxxxxxxx>; Jonathan
> Corbet <corbet@xxxxxxx>; Veerasenareddy Burru <vburru@xxxxxxxxxxx>;
> Satananda Burla <sburla@xxxxxxxxxxx>; Shannon Nelson
> <shannon.nelson@xxxxxxx>; Tony Nguyen <anthony.l.nguyen@xxxxxxxxx>;
> Joshua Hay <joshua.a.hay@xxxxxxxxx>; Rahul Rameshbabu
> <rrameshbabu@xxxxxxxxxx>; Brett Creeley <brett.creeley@xxxxxxx>; Andrew
> Lunn <andrew@xxxxxxx>; Jacob Keller <jacob.e.keller@xxxxxxxxx>
> Subject: [EXT] Re: [PATCH net-next v5 1/8] octeon_ep_vf: Add driver framework
> and device initialization
>
> On Sun, 28 Jan 2024 21:02:47 -0800 Shinas Rasheed wrote:
>
> > + netif_carrier_off(netdev);
> > + netif_tx_disable(netdev);
>
> You haven't masked any IRQ or disabled NAPI. What prevents the queues
> from getting restarted right after this call?

NAPI support, including disabling it on stop, is introduced and used in the next patch [2/8]. Interrupts are disabled in the
disable_interrupt hook, which is also called in that patch.

> > +static void octep_vf_tx_timeout(struct net_device *netdev, unsigned int txqueue)
> > +{
> > + struct octep_vf_device *oct = netdev_priv(netdev);
> > +
> > + queue_work(octep_vf_wq, &oct->tx_timeout_task);
> > +}
>
> I don't see you canceling this work. What if someone unregistered
> the device before it runs? You gotta netdev_hold() a reference.

We call cancel_work_sync() on this work in the octep_vf_remove() function.
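
To illustrate the ordering I have in mind, here is a minimal userspace sketch, not the driver code. It assumes the cancel happens before the device is torn down. Every name prefixed with model_ is hypothetical and stands in for the kernel work API; the real cancel_work_sync() also waits for a handler that is already running, which a single-threaded model cannot show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct model_work {
	bool pending;                      /* queued but not yet run */
	void (*fn)(struct model_work *w);  /* handler to invoke */
};

struct model_dev {
	bool registered;                   /* false once the device is torn down */
	int resets_done;                   /* times the timeout task has run */
	struct model_work tx_timeout_task;
};

/* Recover the device from its embedded work item (the container_of idea). */
static struct model_dev *model_work_to_dev(struct model_work *w)
{
	return (struct model_dev *)((char *)w -
				    offsetof(struct model_dev, tx_timeout_task));
}

/* The deferred task must only ever see a live device. */
static void model_tx_timeout_task(struct model_work *w)
{
	struct model_dev *dev = model_work_to_dev(w);

	assert(dev->registered);
	dev->resets_done++;
}

static void model_dev_init(struct model_dev *dev)
{
	dev->registered = true;
	dev->resets_done = 0;
	dev->tx_timeout_task.pending = false;
	dev->tx_timeout_task.fn = model_tx_timeout_task;
}

/* Models queueing from the tx_timeout callback; false if already pending. */
static bool model_queue_work(struct model_work *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Models the cancel: drop a pending item; true if one was pending. */
static bool model_cancel_work(struct model_work *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* Models the worker thread picking up the item, if any. */
static void model_run_pending(struct model_work *w)
{
	if (!w->pending)
		return;
	w->pending = false;
	w->fn(w);
}

/* Models remove: cancel the deferred work first, then tear down. */
static void model_remove(struct model_dev *dev)
{
	model_cancel_work(&dev->tx_timeout_task);
	dev->registered = false;
}
```

With this ordering, a task queued before remove never runs against a torn-down device. The model does not cover work that is queued again after the cancel, nor reference counting such as the netdev_hold() approach suggested above.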

> > +static int __init octep_vf_init_module(void)
> > +{
> > + int ret;
> > +
> > + pr_info("%s: Loading %s ...\n", OCTEP_VF_DRV_NAME, OCTEP_VF_DRV_STRING);
> > +
> > + /* work queue for all deferred tasks */
> > + octep_vf_wq = create_singlethread_workqueue(OCTEP_VF_DRV_NAME);
>
> Is there a reason this wq has to be single threaded and different than
> system queue? All you schedule on it in this series is the reset task.

We also schedule the control mailbox task on this workqueue. The workqueue was created with the intention
that other driver-specific tasks could be added to it in the future. It is single threaded
for now, which has been enough to service our control plane, although we might optimize the implementation in the future.

Let me know if there are other changes to be made (I ack the other comments you have made), so I can gather everything up and submit another patch. Thanks for your review!