RE: [PATCH 3/3] hv_netvsc: Implement VF matching based on serial numbers
From: KY Srinivasan
Date: Thu Dec 15 2016 - 20:12:58 EST
> -----Original Message-----
> From: devel [mailto:driverdev-devel-bounces@xxxxxxxxxxxxxxxxxxxxxx] On
> Behalf Of Stephen Hemminger
> Sent: Wednesday, December 14, 2016 3:52 PM
> To: Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>
> Cc: olaf@xxxxxxxxx; jasowang@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> bjorn.helgaas@xxxxxxxxx; apw@xxxxxxxxxxxxx;
> devel@xxxxxxxxxxxxxxxxxxxxxx; leann.ogasawara@xxxxxxxxxxxxx; Haiyang
> Zhang <haiyangz@xxxxxxxxxxxxx>
> Subject: Re: [PATCH 3/3] hv_netvsc: Implement VF matching based on serial
> numbers
>
> On Wed, 14 Dec 2016 15:27:58 -0800
> Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> > On Wed, Dec 14, 2016 at 11:18:59PM +0000, Haiyang Zhang wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Greg KH [mailto:gregkh@xxxxxxxxxxxxxxxxxxx]
> > > > Sent: Saturday, December 10, 2016 7:21 AM
> > > > To: Stephen Hemminger <stephen@xxxxxxxxxxxxxxxxxx>
> > > > Cc: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>; olaf@xxxxxxxxx;
> > > > jasowang@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > > > bjorn.helgaas@xxxxxxxxx; apw@xxxxxxxxxxxxx;
> devel@xxxxxxxxxxxxxxxxxxxxxx;
> > > > leann.ogasawara@xxxxxxxxxxxxx
> > > > Subject: Re: [PATCH 3/3] hv_netvsc: Implement VF matching based on
> > > > serial numbers
> > > >
> > > > On Fri, Dec 09, 2016 at 04:21:48PM -0800, Stephen Hemminger wrote:
> > > > > On Fri, 9 Dec 2016 22:35:05 +0000
> > > > > Haiyang Zhang <haiyangz@xxxxxxxxxxxxx> wrote:
> > > > >
> > > > > > > > >
> > > > > > > > > Emulated NIC is already excluded in start of netvc notifier
> > > > handler.
> > > > > > > > >
> > > > > > > > > static int netvsc_netdev_event(struct notifier_block *this,
> > > > > > > > > unsigned long event, void *ptr)
> > > > > > > > > {
> > > > > > > > > struct net_device *event_dev =
> > > > netdev_notifier_info_to_dev(ptr);
> > > > > > > > >
> > > > > > > > > /* Skip our own events */
> > > > > > > > > if (event_dev->netdev_ops == &device_ops)
> > > > > > > > > return NOTIFY_DONE;
> > > > > > > > >
> > > > > > > >
> > > > > > > > Emulated device is not based on netvsc. It's the native Linux
> > > > > > > (dec100M?)
> > > > > > > > Driver. So this line doesn't exclude it. And how about other NIC
> > > > type
> > > > > > > > may be added in the future?
> > > > > > >
> > > > > > > Sorry, forgot about that haven't used emulated device in years.
> > > > > > > The emulated device should appear to be on a PCI bus, but the
> > > > serial
> > > > > > > would not match??
> > > > > >
> > > > > > It's not a vmbus device, not a hv_pci device either. Hv_PCI is a
> > > > subset
> > > > > > of vmbus devices. So emulated NIC won't have hv_pci serial
> number.
> > > > > >
> > > > > > In my patch, the following code ensure, we only try to get serial
> > > > number
> > > > > > after confirming it's vmbus and hv_pci device:
> > > > > >
> > > > > > + if (!dev_is_vmbus(dev))
> > > > > > + continue;
> > > > > > +
> > > > > > + hdev = device_to_hv_device(dev);
> > > > > > + if (hdev->device_id != HV_PCIE)
> > > > > > + continue;
> > > > >
> > > > > Ok, the walk back up the device tree is logically ok, but I don't
> > > > > know enough about PCI device tree to be assured that it is safe.
> > > > > Also, you could short circuit away most of the unwanted devices
> > > > > by making sure the vf_netdev->dev.parent is a PCI device.
> > > >
> > > > Ugh, this seems really really messy. Can't we just have the
> > > > netdev_event interface pass back a pointer to something that we
> "know"
> > > > what it is? This walking the device tree is a mess, and not good.
> > > >
> > > > I'd even argue that dev_is_pci() needs to be removed from the tree
> too,
> > > > as it shouldn't be needed either. We did a lot of work on the driver
> > > > model to prevent the need for having to declare the "type" of 'struct
> > > > device' at all, and by doing this type of thing it goes against the
> > > > basic design of the model.
> > > >
> > > > Yes, it makes things a bit "tougher" in places, but you don't do crazy
> > > > things like walk device trees to try to find random devices and then
> > > > think it's safe to actually use them :(
> > > >
> > >
> > > We register a notifier_block with:
> > > register_netdevice_notifier(struct notifier_block *nb)
> > >
> > > The "struct notifier_block" basically contains a callback function:
> > > struct notifier_block {
> > > notifier_fn_t notifier_call;
> > > struct notifier_block __rcu *next;
> > > int priority;
> > > };
> > >
> > > It doesn't specify which device we want, so all net devices can trigger
> > > this event. Seems we can't have this notifier return VF device only.
> >
> > Ok, I dug in the kernel and it looks like people check the netdev_ops
> > structure to see if it matches up with their function pointers to "know"
> > if this is their device or not. Why not do that here? Or compare the
> > "string" of the driver name? Or any other such trick that the drivers
> > that call register_netdevice_notifier do?
> >
> > All of which are more sane than walking the device tree...
> >
> > And why am I having to do network driver development, ick ick ick :)
> >
> > Come on, 'git grep' is your friend. Even better yet, use a good tool
> > like 'vgrep' which makes git grep work really really well.
>
> Normally, that would work but in this case we have one driver (netvsc)
> which is managing another driver which is unaware of Hyper-V or netvsc
> drivers existence. The callback is happening in netvsc driver and it
> needs to say "hey I know that SR-IOV device, it is associated with my
> network device". This problem is how to know that N is associated with
> V? The V device has to be a network device, that is easy. But then it
> also has to be a PCI device, not to bad. But then the netvsc code
> is matching based on hyper-V only PCI bus metadata (the serial #).
>
> The Microsoft developers made the rational decision not to go modifying
> all the possible SR-IOV network devices from Intel and Mellanox to add
> the functionality there. That would have been much worse.
>
> Maybe, rather than trying to do the management in the kernel it
> could have been done better in user space. Unfortunately, this would
> only move the problem. The PCI-hyperv host driver could expose serial
> value through sysfs (with some pain). But the problem would be how
> to make a new API to join the two V and N device. Doing a private
> ioctl is worse than the notifier.
All this has been discussed earlier in the thread. I think I have a solution
to the problem:
The only PCI (non-VF) NIC that may be present in the VM is the emulated NIC and
we know exactly the device ID and vendor ID of this NIC. Furthermore,
as a platform we are not going to be emulating additional NICs. So,
if the PCI NIC is not the emulated NIC, it must be a VF and we can extract the
serial number.
Regards,
K. Y
> _______________________________________________
> devel mailing list
> devel@xxxxxxxxxxxxxxxxxxxxxx
> https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fdriverd
> ev.linuxdriverproject.org%2Fmailman%2Flistinfo%2Fdriverdev-
> devel&data=02%7C01%7Ckys%40microsoft.com%7C77c2c8a38fe2431945e408
> d4247c2c7d%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C63617356
> 3122444667&sdata=u5C0v7ixzRu%2Btw51tTzHNpbsNqCDQTpigzUtwahIPvE%
> 3D&reserved=0