Re: RFC: Network Plugin Architecture (NPA) for vmxnet3

From: Avi Kivity
Date: Wed May 05 2010 - 14:00:40 EST


On 05/05/2010 02:02 AM, Pankaj Thakkar wrote:
> 2. Hypervisor control: All control operations from the guest, such as programming
> the MAC address, go through the hypervisor layer and hence can be subjected to
> hypervisor policies. The PF driver can further be used to enforce policy decisions
> such as which VLAN the guest should be on.

Is this enforced? Since you pass the hardware through, you can't rely on the guest actually doing this, yes?

> The plugin image is provided by the IHVs along with the PF driver and is
> packaged in the hypervisor. The plugin image is OS agnostic and can be loaded
> either into a Linux VM or a Windows VM. The plugin is written against the Shell
> API, which the shell is responsible for implementing. The API allows the plugin
> to do TX and RX only by programming the hardware rings (along with things like
> buffer allocation and basic initialization). The virtual machine comes up in
> paravirtualized/emulated mode when it is booted. The hypervisor allocates the
> VF and other resources and notifies the shell of the availability of the VF.
> The hypervisor injects the plugin into a memory location specified by the
> shell. The shell initializes the plugin by calling into a known entry point,
> and the plugin initializes the data path. The control path is already
> initialized by the PF driver when the VF is allocated. At this point the shell
> switches to using the loaded plugin for all further TX and RX operations. The
> guest networking stack does not participate in these operations and continues
> to function normally. All the control operations continue to be trapped by the
> hypervisor and are directed to the PF driver as needed. For example, if the
> MAC address changes, the hypervisor updates its internal state and changes the
> state of the embedded switch as well through the PF control API.
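
To make the quoted handoff concrete, a minimal sketch of what such a shell/plugin interface could look like. Every name and signature below is invented for illustration; this is not the actual NPA ABI.

    /*
     * Illustrative only: invented names, not the real NPA ABI. The shell
     * exposes a small service table to the plugin, and the plugin returns a
     * table of data-path operations it implements against the VF's rings.
     */
    #include <stdint.h>
    #include <stddef.h>

    /* Services the shell provides (buffer allocation, frame delivery, ...). */
    struct shell_api {
            void *(*alloc_dma)(size_t len, uint64_t *dma_addr);
            void  (*free_dma)(void *va, uint64_t dma_addr, size_t len);
            void  (*deliver_rx)(void *frame, size_t len);  /* hand a received frame to the guest stack */
    };

    /* Data-path operations the plugin implements by programming the rings. */
    struct plugin_ops {
            int  (*init)(const struct shell_api *shell, void *vf_bar, size_t bar_len);
            int  (*tx)(const void *frame, size_t len);     /* post a frame on the TX ring */
            void (*rx_poll)(unsigned int budget);          /* drain completed RX descriptors */
            void (*shutdown)(void);
    };

    /* The known entry point the shell calls after injecting the plugin image. */
    int plugin_entry(const struct shell_api *shell, struct plugin_ops *ops);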

This is essentially a miniature network stack with its own mini bonding layer, mini hotplug, and mini API, except s/API/ABI/. Is this a correct view?

If so, the Linuxy approach would be to use the ordinary drivers and the Linux networking API, and hide the bond setup using namespaces. The bond driver, or perhaps a new, similar driver, could be enhanced to propagate ethtool commands to its (hidden) components, and to have a control channel with the hypervisor.

This would make the approach hypervisor-agnostic: you're just pairing two devices and presenting them to the rest of the stack as a single device.
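
A rough sketch of the kind of ethtool propagation meant here; the npa_* names and the npa_priv structure are invented, but the pattern is the same delegation the bonding driver already does for some queries:

    #include <linux/netdevice.h>
    #include <linux/ethtool.h>

    struct npa_priv {
            struct net_device *active;  /* hidden component currently carrying traffic: VF or emulated NIC */
    };

    /* Forward an ethtool query from the visible master device to the active slave. */
    static u32 npa_get_link(struct net_device *dev)
    {
            struct npa_priv *priv = netdev_priv(dev);
            struct net_device *slave = priv->active;

            if (slave && slave->ethtool_ops && slave->ethtool_ops->get_link)
                    return slave->ethtool_ops->get_link(slave);
            return 0;
    }

    static const struct ethtool_ops npa_ethtool_ops = {
            .get_link = npa_get_link,
    };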

> We have reworked our existing Linux vmxnet3 driver to accommodate NPA by
> splitting the driver into two parts: Shell and Plugin. The new split driver is

So the Shell would be the reworked or new bond driver, and Plugins would be ordinary Linux network drivers.
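
In that view the shell's transmit path is little more than a hand-off to whichever ordinary driver is currently active. Continuing the invented npa_priv sketch above, roughly:

    #include <linux/netdevice.h>
    #include <linux/skbuff.h>

    /* struct npa_priv as in the earlier sketch: one pointer to the active slave. */
    static netdev_tx_t npa_start_xmit(struct sk_buff *skb, struct net_device *dev)
    {
            struct npa_priv *priv = netdev_priv(dev);
            struct net_device *slave = priv->active;

            if (!slave) {
                    dev_kfree_skb_any(skb);
                    return NETDEV_TX_OK;
            }

            /* The ordinary slave driver (emulated vmxnet3 or the VF) does the real TX. */
            skb->dev = slave;
            dev_queue_xmit(skb);
            return NETDEV_TX_OK;
    }

    static const struct net_device_ops npa_netdev_ops = {
            .ndo_start_xmit = npa_start_xmit,
    };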


--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
