Re: [RFC net-next 0/8] Introducing subdev bus and devlink extension

From: Yunsheng Lin
Date: Thu Jun 10 2021 - 03:04:37 EST


On 2021/6/9 21:45, Parav Pandit wrote:
>> From: Yunsheng Lin <linyunsheng@xxxxxxxxxx>
>> Sent: Wednesday, June 9, 2021 6:00 PM
>>
>> On 2021/6/9 19:59, Parav Pandit wrote:
>>>> From: Yunsheng Lin <linyunsheng@xxxxxxxxxx>
>>>> Sent: Wednesday, June 9, 2021 4:35 PM
>>>>
>>>> On 2021/6/9 17:38, Parav Pandit wrote:
>>>>>
>>>>>> From: Yunsheng Lin <linyunsheng@xxxxxxxxxx>
>>>>>> Sent: Wednesday, June 9, 2021 2:46 PM
>>>>>>
>>>>> [..]
>>>>>
>>>>>>>> Is there any reason why a VF uses its own devlink instance?
>>>>>>>
>>>>>>> Primary use case for VFs is virtual environments where guest isn't
>>>>>>> trusted, so tying the VF to the main devlink instance, over which
>>>>>>> guest should have no control is counter productive.
>>>>>>
>>>>>> The security concern is mainly about the case where a VF is used in
>>>>>> a container, right? When a VF is used in a VM, it is a different
>>>>>> host, which means a different devlink instance for the VF, so there
>>>>>> is no security issue for a VF used in a VM?
>>>>>> But that might not be the case for a VF used in a container?
>>>>> Devlink instance has net namespace attached to it controlled using
>>>>> devlink reload command.
>>>>> So a VF devlink instance can be assigned to a container/process
>>>>> running in a specific net namespace.
>>>>>
>>>>> $ ip netns add n1
>>>>> $ devlink dev reload pci/0000:06:00.4 netns n1
>>>>>                      ^^^^^^^^^^^^^^^^
>>>>>                      PCI VF/PF/SF.
>>>>
>>>> Could we create another devlink instance when the net namespace of
>>>> devlink port instance is changed?
>>> Net namespace of (a) netdevice (b) rdma device (c) devlink instance can be
>>> changed.
>>> Net namespace of devlink port cannot be changed.
>>
>> Yes, net namespace is changed based on the devlink instance, not devlink
>> port instance, *right now*.
>>
>>>
>>>> It seems we may need to change the net namespace based on the devlink
>>>> port instance instead of the devlink instance.
>>>> This way the container case seems to be similar to the VM case?
>>> I mostly do not understand the topology you have in mind or if you
>>> explained previously I missed the thread.
>>> In your case what is the flavour of a devlink port?
>>
>> flavour of the devlink port instance is FLAVOUR_PHYSICAL or
>> FLAVOUR_VIRTUAL.
>>
>> The reason I suggest changing the net namespace on the devlink port instance
>> instead of the devlink instance is:
>> I proposed that all the PFs and VFs in the same ASIC are registered to the same
>> devlink instance as flavour FLAVOUR_PHYSICAL or FLAVOUR_VIRTUAL when
>> they are in the same host and in the same net namespace.
>>
>> If a VF's devlink port instance is unregistered from the old devlink instance
>> in the old net namespace and registered to a new devlink instance in the new
>> net namespace (creating a new devlink instance if needed) when the devlink
>> port instance's net namespace is changed, then the security issue mentioned
>> by Jakub is not an issue any more?
>
> It seems that devlink instance of VF is not needed in your case, and if so what is the motivation to even have VIRTUAL port attach to the PF?

The devlink instance is mainly used to hold the devlink port instance
of the VF when there is only one VF in the VM. We might still need
param/health specific to the VF to be registered to the devlink port
instance of that VF.

> If only netdevice of the VF is of interest, it can be assigned to net namespace directly.

I think that is another option, if there is nothing in the devlink port
instance specific to the VF that needs to be exposed to the user in
another net namespace.
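
To make the two options concrete, here is a rough sketch of how they
differ from user space. It is only an illustration: the VF netdevice
name eth_vf0 is a made-up placeholder, the PCI address and namespace n1
are reused from the example quoted earlier in this thread, and the
namespace is assumed to already exist (ip netns add n1). The script only
prints the commands unless RUN=1 is set, since running them needs root
and real hardware.

```shell
#!/bin/sh
# Sketch only. VF_NETDEV is a hypothetical name; VF_PCI and NETNS are
# copied from the example quoted earlier in the thread.
VF_NETDEV="${VF_NETDEV:-eth_vf0}"
VF_PCI="${VF_PCI:-pci/0000:06:00.4}"
NETNS="${NETNS:-n1}"

# Print the command; execute it only when RUN=1 is set.
run() {
    echo "$*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Option A: move only the VF netdevice into the namespace.
# Nothing devlink-related follows it.
move_netdev() {
    run ip link set dev "$1" netns "$2"
}

# Option B: move the VF's whole devlink instance into the namespace,
# as in the quoted "devlink dev reload ... netns" example.
move_devlink() {
    run devlink dev reload "$1" netns "$2"
}

move_netdev "$VF_NETDEV" "$NETNS"
move_devlink "$VF_PCI" "$NETNS"
```

With option A the user in n1 would see only the netdevice, so any
param/health specific to the VF would stay where the devlink instance
lives.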

>
> It doesn’t make sense to me to create new devlink instance in new net namespace, that also needs to be deleted when net ns is deleted.
> And pre_exit() routine will mostly deadlock holding global devlink_mutex.

Could you be more specific about why there would be a deadlock?
It seems more of an implementation detail, which we can discuss later,
once we agree that this is the right direction to go deeper into?

>