Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
From: Alex Williamson
Date: Sat Aug 24 2019 - 00:59:40 EST
On Sat, 24 Aug 2019 03:56:08 +0000
Parav Pandit <parav@xxxxxxxxxxxx> wrote:
> > -----Original Message-----
> > From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > Sent: Saturday, August 24, 2019 1:14 AM
> > To: Parav Pandit <parav@xxxxxxxxxxxx>
> > Cc: Jiri Pirko <jiri@xxxxxxxxxxx>; Jiri Pirko <jiri@xxxxxxxxxxxx>; David S . Miller
> > <davem@xxxxxxxxxxxxx>; Kirti Wankhede <kwankhede@xxxxxxxxxx>; Cornelia
> > Huck <cohuck@xxxxxxxxxx>; kvm@xxxxxxxxxxxxxxx; linux-
> > kernel@xxxxxxxxxxxxxxx; cjia <cjia@xxxxxxxxxx>; netdev@xxxxxxxxxxxxxxx
> > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> >
> > On Fri, 23 Aug 2019 18:00:30 +0000
> > Parav Pandit <parav@xxxxxxxxxxxx> wrote:
> >
> > > > -----Original Message-----
> > > > From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > > > Sent: Friday, August 23, 2019 10:47 PM
> > > > To: Parav Pandit <parav@xxxxxxxxxxxx>
> > > > Cc: Jiri Pirko <jiri@xxxxxxxxxxx>; Jiri Pirko <jiri@xxxxxxxxxxxx>;
> > > > David S . Miller <davem@xxxxxxxxxxxxx>; Kirti Wankhede
> > > > <kwankhede@xxxxxxxxxx>; Cornelia Huck <cohuck@xxxxxxxxxx>;
> > > > kvm@xxxxxxxxxxxxxxx; linux- kernel@xxxxxxxxxxxxxxx; cjia
> > > > <cjia@xxxxxxxxxx>; netdev@xxxxxxxxxxxxxxx
> > > > Subject: Re: [PATCH v2 0/2] Simplify mtty driver and mdev core
> > > >
> > > > On Fri, 23 Aug 2019 16:14:04 +0000
> > > > Parav Pandit <parav@xxxxxxxxxxxx> wrote:
> > > >
> > > > > > > Idea is to have mdev alias as optional.
> > > > > > > Each mdev_parent says whether it wants mdev_core to generate
> > > > > > > an alias or not. So only networking device drivers would set it to true.
> > > > > > > For rest, alias won't be generated, and won't be compared
> > > > > > > either during creation time. User continue to provide only uuid.
> > > > > >
> > > > > > Ok
> > > > > >
> > > > > > > I am tempted to have alias collision detection only within
> > > > > > > children mdevs of the same parent, but doing so will always
> > > > > > > mandate to prefix in netdev name. And currently we are left
> > > > > > > with only 3 characters to prefix it, so that may not be good either.
> > > > > > > Hence, I think mdev core wide alias is better with 12 characters.
> > > > > >
> > > > > > I suppose it depends on the API, if the vendor driver can ask
> > > > > > the mdev core for an alias as part of the device creation
> > > > > > process, then it could manage the netdev namespace for all its
> > > > > > devices, choosing how many characters to use, and fail the
> > > > > > creation if it can't meet a uniqueness requirement. IOW,
> > > > > > mdev-core would always provide a full
> > > > > > sha1 and therefore gets itself out of the uniqueness/collision aspects.
> > > > > >
> > > > > This doesn't work. At mdev core level 20 bytes sha1 are unique, so
> > > > > mdev core allowed to create a mdev.
> > > >
> > > > The mdev vendor driver has the opportunity to fail the device
> > > > creation in mdev_parent_ops.create().
> > > >
> > > That is not helpful for below reasons.
> > > 1. vendor driver doesn't have visibility in other vendor's alias.
> > > 2. Even for single vendor, it needs to maintain global list of devices to see
> > collision.
> > > 3. multiple vendors needs to implement same scheme.
> > >
> > > Mdev core should be the owner. Shifting ownership from one layer to a
> > > lower layer in vendor driver doesn't solve the problem (if there is
> > > one, which I think doesn't exist).
> > >
> > > > > And then devlink core chooses
> > > > > only 6 bytes (12 characters) and there is collision. Things fall
> > > > > apart. Since mdev provides unique uuid based scheme, it's the mdev
> > > > > core's ownership to provide unique aliases.
> > > >
> > > > You're suggesting/contemplating multiple solutions here, 3-char
> > > > prefix + 12- char sha1 vs <parent netdev> + ?-char sha1. Also, the
> > > > 15-char total limit is imposed by an external subsystem, where the
> > > > vendor driver is the gateway between that subsystem and mdev. How
> > > > would mdev integrate with another subsystem that maybe only has
> > > > 9-chars available? Would the vendor driver API specify "I need an
> > > > alias" or would it specify "I need an X-char length alias"?
> > > Yes, Vendor driver should say how long the alias it wants.
> > > However before we implement that, I suggest let such
> > > vendor/user/driver arrive which needs that. Such variable length alias
> > > can be added at that time and even with that alias collision can be
> > > detected by single mdev module.
> >
> > If we agree that different alias lengths are possible, then I would request that
> > minimally an mdev sample driver be modified to request an alias with a length
> > that can be adjusted without recompiling in order to exercise the collision path.
> >
> Yes. this can be done. But I fail to understand the need to do so.
> It is not the responsibility of the mdev core to show case sha1
> collision efficiency/deficiency. So why do you insist exercise it?
I don't understand what you're trying to imply with "show case sha1
collision efficiency/deficiency". Are you suggesting that I'm asking
for this feature to experimentally test the probability of collisions
at different character lengths? We can use shell scripts for that.
I'm simply observing that collisions are possible based on user input,
but they're not practical to test for at the character lengths we're
using. Therefore, how do I tell QA to develop a tests to make sure the
kernel and userspace tools that might be involved behave correctly when
this rare event occurs?
As I mentioned previously, we can burn the cpu cyles to find some uuids
which will collide with our aliases, but the more accessible approach
seems to be to have a tune-able to reduce the alias address space such
that we can simply throw enough random uuids into the test to guarantee
a collision. Simply generating 10,000 devices with a 12-character
alias, as you suggested previously, has effectively a 0% probability of
generating a collision.
If we accept that different vendor drivers might have different alias
requirements, and therefore the vendor driver should have the ability
to specify an alias length, then this all fits very nicely into
modifying a sample driver to request a sufficiently short alias such
that we can use it to test the behavior of mdev-core and surrounding
code when an alias collision occurs.
> > If mdev-core is guaranteeing uniqueness, does this indicate that
> > each alias length constitutes a separate namespace? ie. strictly a
> > strcmp(), not a strncmp() to the shorter alias.
> >
> Yes.
>
>
> > > > Does it make sense that mdev-core would fail creation of a
> > > > device if there's a collision in the 12-char address space
> > > > between different subsystems? For example, does
> > > > enm0123456789ab really collide with xyz0123456789ab?
> > > I think so, because at mdev level its 12-char alias matters.
> > > Choosing the prefix not adding prefix is really a user space
> > > choice.
> > > > So if
> > > > mdev were to provided a 40-char sha1, is it possible that the
> > > > vendor driver could consume this in its create callback,
> > > > truncate it to the number of chars required by the vendor
> > > > driver's subsystem, and determine whether a collision exists?
> > > We shouldn't shift the problem from mdev to multiple vendor
> > > drivers to detect collision.
> > >
> > > I still think that user providing alias is better because it
> > > knows the use-case system in use, and eliminates these collision
> > > issue.
> >
> > How is a user provided alias immune from collisions? The burden is
> > on the user to provide both a unique uuid and a unique alias. That
> > makes it trivial to create a collision.
> >
> Than such collision should have occurred for other subsystem such as
> netdev while creating vlan, macvlan, ipvlan, vxlan and more devices
> who are named by the user. But that isn't the case.
>
> > > > > > > I do not understand how an extra character reduces
> > > > > > > collision, if that's what you meant.
> > > > > >
> > > > > > If the default were for example 3-chars, we might already
> > > > > > have device 'abc'. A collision would expose one more char
> > > > > > of the new device, so we might add device with alias
> > > > > > 'abcd'. I mentioned previously that this leaves an issue
> > > > > > for userspace that we can't change the alias of device abc,
> > > > > > so without additional information, userspace can only
> > > > > > determine via elimination the mapping of alias to device,
> > > > > > but userspace has more information available to it in the
> > > > > > form of sysfs links.
> > > > > > > Module options are almost not encouraged anymore with
> > > > > > > other subsystems/drivers.
> > > > > >
> > > > > > We don't live in a world of absolutes. I agree that the
> > > > > > defaults should work in the vast majority of cases.
> > > > > > Requiring a user to twiddle module options to make things
> > > > > > work is undesirable, verging on a bug. A module option to
> > > > > > enable some specific feature, unsafe condition, or test
> > > > > > that is outside of the typical use case is reasonable,
> > > > > > imo.
> > > > > > > For testing collision rate, a sample user space script and
> > > > > > > sample mtty is easy and get us collision count too. We
> > > > > > > shouldn't put that using module option in production
> > > > > > > kernel. I practically have the code ready to play with;
> > > > > > > Changing 12 to smaller value is easy with module reload.
> > > > > > >
> > > > > > > #define MDEV_ALIAS_LEN 12
> > > > > >
> > > > > > If it can't be tested with a shipping binary, it probably
> > > > > > won't be tested. Thanks,
> > > > > It is not the role of mdev core to expose collision
> > > > > efficiency/deficiency of the sha1. It can be tested outside
> > > > > before mdev choose to use it.
> > > >
> > > > The testing I'm considering is the user and kernel response to a
> > > > collision.
> > > > > I am saying we should test with 12 characters with 10,000 or
> > > > > more devices and see how collision occurs. Even if collision
> > > > > occurs, mdev returns EEXIST status indicating user to pick a
> > > > > different UUID for those rare conditions.
> > > >
> > > > The only way we're going to see collision with a 12-char sha1
> > > > is if we burn the CPU cycles to find uuids that collide in that
> > > > space. 10,000 devices is not remotely enough to generate a
> > > > collision in that address space. That puts a prerequisite in
> > > > place that in order to test collision, someone needs to know
> > > > certain magic inputs. OTOH, if we could use a shorter
> > > > abbreviation, collisions are trivial to test experimentally.
> > > > Thanks,
> > > Yes, and therefore a sane user who wants to create more mdevs,
> > > wouldn't intentionally stress it to see failures.
> >
> > I don't understand this logic. I'm simply asking that we have a
> > way to test the collision behavior without changing the binary.
> > The path we're driving towards seems to be making this easier and
> > easier. If the vendor can request an alias of a specific length,
> > then a sample driver with a module option to set the desired alias
> > length to 1-char makes it trivially easy to induce a collision.
> Sure it is easy to test collision, but my point is - mdev core is not
> sha1 test module. Hence adding functionality of variable alias length
> to test collision doesn't make sense. When the actual user arrives
> who needs small alias, we will be able to add additional pieces very
> easily.
>
> > It doesn't
> > even need to be exposed in a real driver. Besides, when do we ever
> > get to design interfaces that only worry about sane users???
> > Thanks,
> I intent to say that a sane user who wants to create mdev's will just
> work fine with less collision. If there is collision EEXIST is
> returns and sane user picks different UUID. If user is intentionally
> picking UUIDs in such a way that triggers sha1 collision, his
> intention is likely to not create mdevs for actual use. And if
> interface returns error code it is still fine.
This is exactly the scenarios that I'm asking "how do we test that it
works as we expect". I can test that passing identical uuids into the
mdev create interface only allows the first to succeed. With a 12-char
sha1 alias, it's not practical to construct a test to validate the
alias collision behavior. Do you suggest we rely only on code
inspection instead? Thanks,
Alex