RE: [RFC net-next 0/8] Introducing subdev bus and devlink extension

From: Parav Pandit
Date: Thu Jun 10 2021 - 03:17:19 EST




> From: Yunsheng Lin <linyunsheng@xxxxxxxxxx>
> Sent: Thursday, June 10, 2021 12:34 PM
>
> On 2021/6/9 21:45, Parav Pandit wrote:
> >> From: Yunsheng Lin <linyunsheng@xxxxxxxxxx>
> >> Sent: Wednesday, June 9, 2021 6:00 PM
> >>
> >> On 2021/6/9 19:59, Parav Pandit wrote:
> >>>> From: Yunsheng Lin <linyunsheng@xxxxxxxxxx>
> >>>> Sent: Wednesday, June 9, 2021 4:35 PM
> >>>>
> >>>> On 2021/6/9 17:38, Parav Pandit wrote:
> >>>>>
> >>>>>> From: Yunsheng Lin <linyunsheng@xxxxxxxxxx>
> >>>>>> Sent: Wednesday, June 9, 2021 2:46 PM
> >>>>>>
> >>>>> [..]
> >>>>>
> >>>>>>>> Is there any reason why VF use its own devlink instance?
> >>>>>>>
> >>>>>>> Primary use case for VFs is virtual environments where guest
> >>>>>>> isn't trusted, so tying the VF to the main devlink instance,
> >>>>>>> over which guest should have no control is counter productive.
> >>>>>>
> >>>>>> The security concern is mainly about a VF used in a container, right?
> >>>>>> When a VF is used in a VM, the VM is a different host, which means a
> >>>>>> different devlink instance for the VF, so there is no security issue
> >>>>>> in the VM case?
> >>>>>> But that might not hold for a VF used in a container?
> >>>>> A devlink instance has a net namespace attached to it, controlled
> >>>>> using the devlink reload command.
> >>>>> So a VF devlink instance can be assigned to a container/process
> >>>>> running in a specific net namespace.
> >>>>>
> >>>>> $ ip netns add n1
> >>>>> $ devlink dev reload pci/0000:06:00.4 netns n1
> >>>>>                      ^^^^^^^^^^^^^^^^
> >>>>>                      PCI VF/PF/SF.
> >>>>
> >>>> Could we create another devlink instance when the net namespace of
> >>>> devlink port instance is changed?
> >>> The net namespace of (a) a netdevice, (b) an rdma device, (c) a devlink
> >>> instance can be changed.
> >>> The net namespace of a devlink port cannot be changed.
> >>
> >> Yes, net namespace is changed based on the devlink instance, not
> >> devlink port instance, *right now*.
> >>
> >>>
> >>>> It seems we may need to change the net namespace based on the devlink
> >>>> port instance instead of the devlink instance.
> >>>> This way the container case would be similar to the VM case?
> >>> I mostly do not understand the topology you have in mind, or if you
> >>> explained it previously, I missed the thread.
> >>> In your case what is the flavour of a devlink port?
> >>
> >> flavour of the devlink port instance is FLAVOUR_PHYSICAL or
> >> FLAVOUR_VIRTUAL.
> >>
> >> The reason I suggest changing the net namespace on the devlink port
> >> instance instead of the devlink instance is:
> >> I proposed that all the PFs and VFs in the same ASIC are registered to
> >> the same devlink instance as flavour FLAVOUR_PHYSICAL or
> >> FLAVOUR_VIRTUAL when they are in the same host and in the same net
> >> namespace.
> >>
> >> If a VF's devlink port instance is unregistered from the old devlink
> >> instance in the old net namespace and registered to a new devlink
> >> instance in the new net namespace (creating a new devlink instance if
> >> needed) when the devlink port instance's net namespace is changed, then
> >> the security concern mentioned by Jakub is no longer an issue?
> >
> > It seems that a devlink instance for the VF is not needed in your case,
> > and if so, what is the motivation to even have a VIRTUAL port attached
> > to the PF?
>
> The devlink instance is mainly used to hold the devlink port instance of the
> VF. If there is only one VF in a VM, we might still need to have param/health
> specific to the VF registered to the devlink port instance of that VF.
>
This will cover things uniformly with/without container or VM.

> > If only the netdevice of the VF is of interest, it can be assigned to a
> > net namespace directly.
>
> I think that is another option, if there is nothing in the devlink port instance
> specific to the VF that needs exposing to the user in another net namespace.
>
Yes, in that case there is no need for a devlink instance or devlink port.

> >
> > It doesn’t make sense to me to create a new devlink instance in the new net
> > namespace, which then also needs to be deleted when the net ns is deleted.
> > And the pre_exit() routine will most likely deadlock while holding the
> > global devlink_mutex.
>
> Could you be more specific about why there is a deadlock?
The net namespace exit routine cannot invoke a devlink API that needs to acquire the global devlink mutex.
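
To illustrate the locking problem, here is a minimal userspace sketch in Python. It is a model only, not kernel code. It assumes, as mentioned earlier, that the pre_exit() path runs while holding the global devlink_mutex. The function names pernet_pre_exit() and devlink_instance_free() are hypothetical stand-ins, and the inner call uses a non-blocking try-lock so that the sketch reports the problem instead of hanging.

```python
import threading

# Userspace stand-in for the global, non-recursive devlink_mutex.
devlink_mutex = threading.Lock()


def devlink_instance_free(log):
    """Hypothetical devlink API that must take the global mutex."""
    # A real mutex_lock() would block forever here if the caller already
    # holds the lock; the try-lock lets the sketch report that instead.
    if not devlink_mutex.acquire(blocking=False):
        log.append("would deadlock: devlink_mutex already held")
        return False
    try:
        log.append("instance freed")
        return True
    finally:
        devlink_mutex.release()


def pernet_pre_exit(log):
    """Model of a net namespace exit handler that holds the mutex (assumed)."""
    with devlink_mutex:
        log.append("pre_exit: holding devlink_mutex")
        # Deleting a per-namespace devlink instance from here needs the
        # same mutex again, which a non-recursive lock cannot grant.
        return devlink_instance_free(log)


if __name__ == "__main__":
    events = []
    pernet_pre_exit(events)
    print("\n".join(events))
```

Called on its own, devlink_instance_free() succeeds. It fails only when reached from the exit handler, which is the case a devlink instance created per net namespace would have to handle.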