Re: [PATCH v4] IB/hfi1: allocate dummy net_device dynamically

From: Breno Leitao
Date: Wed Apr 03 2024 - 08:18:21 EST


On Mon, Apr 01, 2024 at 11:34:23AM -0400, Dennis Dalessandro wrote:
> On 4/1/24 10:53 AM, Jakub Kicinski wrote:
> > On Mon, 1 Apr 2024 14:53:31 +0300 Leon Romanovsky wrote:
> >> On Tue, Mar 19, 2024 at 02:09:43AM -0700, Breno Leitao wrote:
> >>> Embedding net_device into structures prohibits the usage of flexible
> >>> arrays in the net_device structure. For more details, see the discussion
> >>> at [1].
> >>>
> >>> Un-embed the net_device from struct hfi1_netdev_rx by converting it
> >>> into a pointer. Then use the leverage alloc_netdev() to allocate the
> >>> net_device object at hfi1_alloc_rx().
> >>>
> >>> [1] https://lore.kernel.org/all/20240229225910.79e224cf@xxxxxxxxxx/
> >>>
> >>> Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>
> >>> Acked-by: Dennis Dalessandro <dennis.dalessandro@xxxxxxxxxxxxxxxxxxxx>
> >>
> >> Jakub,
> >>
> >> I create shared branch for you, please pull it from:
> >> https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/log/?h=remove-dummy-netdev
> >
> > Did you merge it in already?
> > Turned out that the use of init_dummy_netdev as a setup function
> > is broken, I'm not sure how Dennis tested this :(
> > We should have pinged you, sorry.
>
> This is what I tested, Linus 6.8 tag + cherry pick + Breno patch. So if
> something went in that broke it I didn't have it in my tree.
>
> commit 311810a6d7e37d8e7537d50e26197b7f5f02f164 (linus-master)
> Author: Breno Leitao <leitao@xxxxxxxxxx>
> Date: Wed Mar 13 03:33:10 2024 -0700
>
> IB/hfi1: allocate dummy net_device dynamically

This one has a potential bug that causes a kernel panic when the module
is removed.

This is because alloc_netdev() sets up some data structures inside the
net_device, and init_dummy_netdev(), run as the setup function, then
overwrites them with a memset(). At free time, free_netdev() dereferences
those structures, which by then have been zeroed.
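
To make the sequence concrete, here is a rough userspace model of the
failure. It is not kernel code: struct fake_netdev, fake_alloc(),
fake_setup_memset() and fake_free() are hypothetical stand-ins for
net_device, alloc_netdev(), init_dummy_netdev() and free_netdev(), and
fake_free() returns -1 at the point where the real code would dereference
zeroed state.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct net_device; not the real layout. */
struct fake_netdev {
	int *alloc_state;	/* models data set up by the allocator */
	int is_dummy;		/* models the dummy-device marking */
};

/*
 * Models a setup callback that memsets the whole structure, which
 * wipes whatever the allocator prepared before calling it.
 */
void fake_setup_memset(struct fake_netdev *dev)
{
	memset(dev, 0, sizeof(*dev));
	dev->is_dummy = 1;
}

/* Models the allocator: prepare internal state first, then run setup. */
struct fake_netdev *fake_alloc(void (*setup)(struct fake_netdev *))
{
	struct fake_netdev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->alloc_state = malloc(sizeof(*dev->alloc_state));
	if (!dev->alloc_state) {
		free(dev);
		return NULL;
	}
	*dev->alloc_state = 1;
	setup(dev);
	return dev;
}

/*
 * Models the free path. Returns 0 on success, or -1 where the real
 * code would crash because the allocator's state has been zeroed.
 * In the failing case the original alloc_state buffer is simply lost.
 */
int fake_free(struct fake_netdev *dev)
{
	int ret = 0;

	assert(dev != NULL);
	if (!dev->alloc_state)
		ret = -1;
	else
		free(dev->alloc_state);
	free(dev);
	return ret;
}
```

In this model, fake_alloc(fake_setup_memset) returns a device whose
alloc_state is NULL, so fake_free() reports the failure instead of
releasing it.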

An upcoming patch adds a helper (alloc_netdev_dummy()) that allocates
the netdev and calls a special version of init_dummy_netdev() that does
not memset the structure.
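
A sketch of how that split could look, again as a userspace model and
not the actual patch: all fake_* names are hypothetical, and the real
helper may be structured differently. The idea is that the memset stays
in the wrapper used by embedded devices, while the allocating helper
passes only the non-memset init as the setup function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct net_device; not the real layout. */
struct fake_netdev {
	int *alloc_state;	/* models data set up by the allocator */
	int is_dummy;		/* models the dummy-device marking */
};

/* Dummy-specific init only; leaves allocator-owned state untouched. */
void fake_init_dummy_core(struct fake_netdev *dev)
{
	dev->is_dummy = 1;
}

/* For callers that embed the structure: zero it, then init. */
void fake_init_dummy(struct fake_netdev *dev)
{
	memset(dev, 0, sizeof(*dev));
	fake_init_dummy_core(dev);
}

/* Models the allocator: prepare internal state first, then run setup. */
struct fake_netdev *fake_alloc(void (*setup)(struct fake_netdev *))
{
	struct fake_netdev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->alloc_state = malloc(sizeof(*dev->alloc_state));
	if (!dev->alloc_state) {
		free(dev);
		return NULL;
	}
	*dev->alloc_state = 1;
	setup(dev);
	return dev;
}

/* Hypothetical helper: allocate and set up a dummy without the memset. */
struct fake_netdev *fake_alloc_dummy(void)
{
	return fake_alloc(fake_init_dummy_core);
}

/* Models the free path: 0 on success, -1 if allocator state was wiped. */
int fake_free(struct fake_netdev *dev)
{
	int ret = 0;

	assert(dev != NULL);
	if (!dev->alloc_state)
		ret = -1;
	else
		free(dev->alloc_state);
	free(dev);
	return ret;
}
```

With this arrangement a device obtained from fake_alloc_dummy() keeps its
alloc_state, so fake_free() can release it normally, while code that
still embeds the structure can keep calling fake_init_dummy().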

I would drop this patch for now, and I will submit a new version using
the new helper.