RE: [PATCH v2 0/2] Simplify mtty driver and mdev core

From: Parav Pandit
Date: Fri Aug 23 2019 - 14:00:40 EST




> -----Original Message-----
> From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> Sent: Friday, August 23, 2019 10:47 PM
> To: Parav Pandit <parav@xxxxxxxxxxxx>
> Cc: Jiri Pirko <jiri@xxxxxxxxxxx>; Jiri Pirko <jiri@xxxxxxxxxxxx>; David S . Miller
> <davem@xxxxxxxxxxxxx>; Kirti Wankhede <kwankhede@xxxxxxxxxx>; Cornelia
> Huck <cohuck@xxxxxxxxxx>; kvm@xxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; cjia <cjia@xxxxxxxxxx>; netdev@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
>
> On Fri, 23 Aug 2019 16:14:04 +0000
> Parav Pandit <parav@xxxxxxxxxxxx> wrote:
>
> > > > Idea is to have mdev alias as optional.
> > > > Each mdev_parent says whether it wants mdev_core to generate an
> > > > alias or not. So only networking device drivers would set it to true.
> > > > For rest, alias won't be generated, and won't be compared either
> > > > during creation time. User continue to provide only uuid.
> > >
> > > Ok
> > >
> > > > I am tempted to have alias collision detection only within
> > > > children mdevs of the same parent, but doing so will always
> > > > mandate to prefix in netdev name. And currently we are left with
> > > > only 3 characters to prefix it, so that may not be good either.
> > > > Hence, I think mdev core wide alias is better with 12 characters.
> > >
> > > I suppose it depends on the API, if the vendor driver can ask the
> > > mdev core for an alias as part of the device creation process, then
> > > it could manage the netdev namespace for all its devices, choosing
> > > how many characters to use, and fail the creation if it can't meet a
> > > uniqueness requirement. IOW, mdev-core would always provide a full
> > > sha1 and therefore gets itself out of the uniqueness/collision aspects.
> > >
> > This doesn't work. At mdev core level 20 bytes sha1 are unique, so
> > mdev core allowed to create a mdev.
>
> The mdev vendor driver has the opportunity to fail the device creation in
> mdev_parent_ops.create().
>
That is not helpful for below reasons.
1. vendor driver doesn't have visibility in other vendor's alias.
2. Even for single vendor, it needs to maintain global list of devices to see collision.
3. multiple vendors needs to implement same scheme.

Mdev core should be the owner. Shifting ownership from one layer to a lower layer in vendor driver doesn't solve the problem
(if there is one, which I think doesn't exist).

> > And then devlink core chooses
> > only 6 bytes (12 characters) and there is collision. Things fall
> > apart. Since mdev provides unique uuid based scheme, it's the mdev
> > core's ownership to provide unique aliases.
>
> You're suggesting/contemplating multiple solutions here, 3-char prefix + 12-
> char sha1 vs <parent netdev> + ?-char sha1. Also, the 15-char total limit is
> imposed by an external subsystem, where the vendor driver is the gateway
> between that subsystem and mdev. How would mdev integrate with another
> subsystem that maybe only has 9-chars available? Would the vendor driver API
> specify "I need an alias" or would it specify "I need an X-char length alias"?
Yes, Vendor driver should say how long the alias it wants.
However before we implement that, I suggest let such vendor/user/driver arrive which needs that.
Such variable length alias can be added at that time and even with that alias collision can be detected by single mdev module.

> Does it make sense that mdev-core would fail creation of a device if there's a
> collision in the 12-char address space between different subsystems? For
> example, does enm0123456789ab really collide with xyz0123456789ab?
I think so, because at mdev level its 12-char alias matters.
Choosing the prefix not adding prefix is really a user space choice.

> So if
> mdev were to provided a 40-char sha1, is it possible that the vendor driver
> could consume this in its create callback, truncate it to the number of chars
> required by the vendor driver's subsystem, and determine whether a collision
> exists?
We shouldn't shift the problem from mdev to multiple vendor drivers to detect collision.

I still think that user providing alias is better because it knows the use-case system in use, and eliminates these collision issue.

>
> > > > I do not understand how an extra character reduces collision, if
> > > > that's what you meant.
> > >
> > > If the default were for example 3-chars, we might already have
> > > device 'abc'. A collision would expose one more char of the new
> > > device, so we might add device with alias 'abcd'. I mentioned
> > > previously that this leaves an issue for userspace that we can't
> > > change the alias of device abc, so without additional information,
> > > userspace can only determine via elimination the mapping of alias to
> > > device, but userspace has more information available to it in the
> > > form of sysfs links.
> > > > Module options are almost not encouraged anymore with other
> > > > subsystems/drivers.
> > >
> > > We don't live in a world of absolutes. I agree that the defaults
> > > should work in the vast majority of cases. Requiring a user to
> > > twiddle module options to make things work is undesirable, verging
> > > on a bug. A module option to enable some specific feature, unsafe
> > > condition, or test that is outside of the typical use case is
> > > reasonable, imo.
> > > > For testing collision rate, a sample user space script and sample
> > > > mtty is easy and get us collision count too. We shouldn't put that
> > > > using module option in production kernel. I practically have the
> > > > code ready to play with; Changing 12 to smaller value is easy with
> > > > module reload.
> > > >
> > > > #define MDEV_ALIAS_LEN 12
> > >
> > > If it can't be tested with a shipping binary, it probably won't be
> > > tested. Thanks,
> > It is not the role of mdev core to expose collision
> > efficiency/deficiency of the sha1. It can be tested outside before
> > mdev choose to use it.
>
> The testing I'm considering is the user and kernel response to a collision.
>
> > I am saying we should test with 12 characters with 10,000 or more
> > devices and see how collision occurs. Even if collision occurs, mdev
> > returns EEXIST status indicating user to pick a different UUID for
> > those rare conditions.
>
> The only way we're going to see collision with a 12-char sha1 is if we burn the
> CPU cycles to find uuids that collide in that space. 10,000 devices is not
> remotely enough to generate a collision in that address space. That puts a
> prerequisite in place that in order to test collision, someone needs to know
> certain magic inputs. OTOH, if we could use a shorter abbreviation, collisions
> are trivial to test experimentally. Thanks,
>
Yes, and therefore a sane user who wants to create more mdevs, wouldn't intentionally stress it to see failures.

> Alex