RE: [PATCH v2 0/2] Simplify mtty driver and mdev core

From: Parav Pandit
Date: Wed Aug 14 2019 - 09:45:57 EST




> -----Original Message-----
> From: Cornelia Huck <cohuck@xxxxxxxxxx>
> Sent: Wednesday, August 14, 2019 6:39 PM
> To: Parav Pandit <parav@xxxxxxxxxxxx>
> Cc: Alex Williamson <alex.williamson@xxxxxxxxxx>; Kirti Wankhede
> <kwankhede@xxxxxxxxxx>; kvm@xxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; cjia@xxxxxxxxxx; Jiri Pirko <jiri@xxxxxxxxxxxx>;
> netdev@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> On Wed, 14 Aug 2019 12:27:01 +0000
> Parav Pandit <parav@xxxxxxxxxxxx> wrote:
>
> > + Jiri, + netdev
> > To get perspective on the ndo->phys_port_name for the representor netdev
> of mdev.
> >
> > Hi Cornelia,
> >
> > > -----Original Message-----
> > > From: Cornelia Huck <cohuck@xxxxxxxxxx>
> > > Sent: Wednesday, August 14, 2019 1:32 PM
> > > To: Parav Pandit <parav@xxxxxxxxxxxx>
> > > Cc: Alex Williamson <alex.williamson@xxxxxxxxxx>; Kirti Wankhede
> > > <kwankhede@xxxxxxxxxx>; kvm@xxxxxxxxxxxxxxx; linux-
> > > kernel@xxxxxxxxxxxxxxx; cjia@xxxxxxxxxx
> > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > >
> > > On Wed, 14 Aug 2019 05:54:36 +0000
> > > Parav Pandit <parav@xxxxxxxxxxxx> wrote:
> > >
> > > > > > I get that part. I prefer to remove the UUID itself from the
> > > > > > structure and therefore removing this API makes lot more sense?
> > > > >
> > > > > Mdev and support tools around mdev are based on UUIDs because
> > > > > it's
> > > defined
> > > > > in the documentation.
> > > > When we introduce newer device naming scheme, it will update the
> > > documentation also.
> > > > May be that is the time to move to .rst format too.
> > >
> > > You are aware that there are existing tools that expect a uuid
> > > naming scheme, right?
> > >
> > Yes, Alex mentioned too.
> > The good tool that I am aware of is [1], which is 4 months old. Not sure if it is
> part of any distros yet.
> >
> > README also says, that it is in 'early in development. So we have scope to
> improve it for non UUID names, but lets discuss that more below.
>
> The up-to-date reference for mdevctl is
> https://github.com/mdevctl/mdevctl. There is currently an effort to get this
> packaged in Fedora.
>
Awesome.

> >
> > > >
> > > > > I don't think it's as simple as saying "voila, UUID dependencies
> > > > > are removed, users are free to use arbitrary strings". We'd
> > > > > need to create some kind of naming policy, what characters are
> > > > > allows so that we can potentially expand the creation parameters
> > > > > as has been proposed a couple times, how do we deal with
> > > > > collisions and races, and why should we make such a change when
> > > > > a UUID is a perfectly reasonable devices name. Thanks,
> > > > >
> > > > Sure, we should define a policy on device naming to be more relaxed.
> > > > We have enough examples in-kernel.
> > > > Few that I am aware of are netdev (vxlan, macvlan, ipvlan, lot
> > > > more), rdma
> > > etc which has arbitrary device names and ID based device names.
> > > >
> > > > Collisions and race is already taken care today in the mdev core.
> > > > Same
> > > unique device names continue.
> > >
> > > I'm still completely missing a rationale _why_ uuids are supposedly
> > > bad/restricting/etc.
> > There is nothing bad about uuid based naming.
> > Its just too long name to derive phys_port_name of a netdev.
> > In details below.
> >
> > For a given mdev of networking type, we would like to have
> > (a) representor netdevice [2]
> > (b) associated devlink port [3]
> >
> > Currently these representor netdevice exist only for the PCIe SR-IOV VFs.
> > It is further getting extended for mdev without SR-IOV.
> >
> > Each of the devlink port is attached to representor netdevice [4].
> >
> > This netdevice phys_port_name should be a unique derived from some
> property of mdev.
> > Udev/systemd uses phys_port_name to derive unique representor netdev
> name.
> > This netdev name is further use by orchestration and switching software in
> user space.
> > One such distro supported switching software is ovs [4], which relies on the
> persistent device name of the representor netdevice.
>
> Ok, let me rephrase this to check that I understand this correctly. I'm not sure
> about some of the terms you use here (even after looking at the linked
> doc/code), but that's probably still ok.
>
> We want to derive an unique (and probably persistent?) netdev name so that
> userspace can refer to a representor netdevice. Makes sense.
> For generating that name, udev uses the phys_port_name (which represents
> the devlink port, IIUC). Also makes sense.
>
You understood it correctly.

> >
> > phys_port_name has limitation to be only 15 characters long.
> > UUID doesn't fit in phys_port_name.
>
> Understood. But why do we need to derive the phys_port_name from the mdev
> device name? This netdevice use case seems to be just one use case for using
> mdev devices? If this is a specialized mdev type for this setup, why not just
> expose a shorter identifier via an extra attribute?
>
Representor netdev, represents mdev's switch port (like PCI SRIOV VF's switch port).
So user must be able to relate this two objects in similar manner as SRIOV VFs.
Phys_port_name is derived from the PCI PF and VF numbering scheme.
Similarly mdev's such port should be derived from mdev's id/name/attribute.

> > Longer UUID names are creating snow ball effect, not just in networking stack
> but many user space tools too.
>
> This snowball effect mainly comes from the device name -> phys_port_name
> setup, IIUC.
>
Right.

> > (as opposed to recently introduced mdevctl, are they more mdev tools
> > which has dependency on UUID name?)
>
> I am aware that people have written scripts etc. to manage their mdevs.
> Given that the mdev infrastructure has been around for quite some time, I'd
> say the chance of some of those scripts relying on uuid names is non-zero.
>
Ok. but those scripts have never managed networking devices.
So those scripts won't break because they will always create mdev devices using UUID.
When they use these new networking devices, they need more things than their scripts.
So user space upgrade for such mixed mode case is reasonable.

> >
> > Instead of mdev subsystem creating such effect, one option we are
> considering is to have shorter mdev names.
> > (Similar to netdev, rdma, nvme devices).
> > Such as mdev1, mdev2000 etc.
> >
> > Second option I was considering is to have an optional alias for UUID based
> mdev.
> > This name alias is given at time of mdev creation.
> > Devlink port's phys_port_name is derived out of this shorter mdev name
> alias.
> > This way, mdev remains to be UUID based with optional extension.
> > However, I prefer first option to relax mdev naming scheme.
>
> Actually, I think that second option makes much more sense, as you avoid
> potentially breaking existing tooling.
Let's first understand of what exactly will break with existing tool if they see non_uuid based device.
>
Existing tooling continue to work with UUID devices.
Do you have example of what can break if they see non_uuid based device name?
I think you are clear, but to be sure, UUID based creation will continue to be there. Optionally mdev will be created with alpha-numeric string, if we don't it as additional attribute.

> >
> > > We want to uniquely identify a device, across different types of
> > > vendor drivers. An uuid is a unique identifier and even a
> > > well-defined one. Tools (e.g. mdevctl) are relying on it for mdev devices
> today.
> > >
> > > What is the problem you're trying to solve?
> > Unique device naming is still achieved without UUID scheme by various
> subsystems in kernel using alpha-numeric string.
> > Having such string based continue to provide unique names.
> >
> > I hope I described the problem and two solutions above.
> >
> > [1] https://github.com/awilliam/mdevctl
> > [2]
> > https://elixir.bootlin.com/linux/v5.3-rc4/source/drivers/net/ethernet/
> > mellanox/mlx5/core/en_rep.c [3]
> > http://man7.org/linux/man-pages/man8/devlink-port.8.html
> > [4]
> > https://elixir.bootlin.com/linux/v5.3-rc4/source/net/core/devlink.c#L6
> > 921
> > [5] https://www.openvswitch.org/
> >