Re: [PATCH V5 2/6 net-next] netdevice.h: Add zero-copy flag in netdevice
From: Michael S. Tsirkin
Date: Wed May 18 2011 - 07:56:11 EST
On Wed, May 18, 2011 at 01:47:33PM +0200, Michał Mirosław wrote:
> On 18 May 2011 13:17, Michael S. Tsirkin
> <mst@xxxxxxxxxx> wrote:
> > On Wed, May 18, 2011 at 01:10:50PM +0200, Michał Mirosław wrote:
> >> 2011/5/18 Michael S. Tsirkin <mst@xxxxxxxxxx>:
> >> > On Tue, May 17, 2011 at 03:28:38PM -0700, Shirley Ma wrote:
> >> >> On Tue, 2011-05-17 at 23:48 +0200, Michał Mirosław wrote:
> >> >> > 2011/5/17 Shirley Ma <mashirle@xxxxxxxxxx>:
> >> >> > > Hello Michael,
> >> >> > >
> >> >> > > Looks like to use a new flag requires more time/work. I am thinking
> >> >> > > whether we can just use HIGHDMA flag to enable zero-copy in macvtap
> >> >> > > to avoid the new flag for now since macvtap uses real NICs as lower
> >> >> > > device?
> >> >> >
> >> >> > Is there any other restriction besides requiring driver to not recycle
> >> >> > the skb? Are there any drivers that recycle TX skbs?
> >> > Not just recycling skbs, keeping reference to any of the pages in the
> >> > skb. Another requirement is to invoke the callback
> >> > in a timely fashion. For example virtio-net doesn't limit the time until
> >> > that happens (skbs are only freed when some other packet is
> >> > transmitted), so we need to avoid zcopy for such (nested-virt)
> >> > scenarios, right?
> >> Hmm. But every hardware driver supporting SG will keep reference to
> >> the pages until the packet is sent (or DMA'd to the device). This can
> >> take a long time if hardware queue happens to stall for some reason.
> > That's a fundamental property of zero copy transmit.
> > You can't let the application/guest reuse the memory until
> > no one looks at it anymore.
> One more question: is userspace (or whatever is sending those packets)
> denied from modifying passed pages? I assume it is, but just want to
> be sure.
> Best Regards,
> Michał Mirosław
It's not denied in the sense that it still can modify them if it's
buggy (the pages might not be read-only).
But well-behaved userspace won't modify them until the callback is invoked.
That would be a problem if the underlying device is
a bridge where we might try to e.g. filter these packets -
data can get modified after the filter. We'd have to copy
whatever the filter accesses and use the copy - it's rarely
the data itself.
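
To illustrate the "copy whatever the filter accesses" idea, here is a
minimal userspace sketch, not kernel code. All names (zc_packet, hdr_copy,
snapshot_header, filter_accepts) are hypothetical, and it assumes the filter
only looks at a 14-byte Ethernet header:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_LEN 14  /* Ethernet header: dst MAC, src MAC, ethertype */

/* Hypothetical packet whose bytes live in memory the sender can still write. */
struct zc_packet {
    uint8_t *shared;  /* sender-owned pages, may change under us */
    size_t   len;
};

/* Private snapshot of the bytes the filter is allowed to look at. */
struct hdr_copy {
    uint8_t bytes[HDR_LEN];
};

/* Copy the header out of shared memory exactly once.
 * Returns 0 on success, -1 if the packet is too short. */
static int snapshot_header(const struct zc_packet *pkt, struct hdr_copy *out)
{
    assert(pkt != NULL && out != NULL);
    if (pkt->len < HDR_LEN)
        return -1;
    memcpy(out->bytes, pkt->shared, HDR_LEN);
    return 0;
}

/* Toy filter: accept only frames addressed to allowed_dst.
 * It reads the snapshot and never touches the shared pages. */
static int filter_accepts(const struct hdr_copy *hdr,
                          const uint8_t allowed_dst[6])
{
    return memcmp(hdr->bytes, allowed_dst, 6) == 0;
}
```

The decision is only meaningful if whatever is sent afterwards also uses the
snapshot for the filtered bytes; the payload can stay zero-copy.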
That's not normally a problem for macvtap connected to a physical NIC,
as that already bypasses any and all filtering.
But that's another limitation we should note in the comment,
and another reason to limit to specific devices.
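
As a rough sketch of what "limit to specific devices" could look like,
assuming an explicit opt-in bit rather than reusing HIGHDMA. The flag names
and struct below are made up for illustration and are not the real netdev
feature flags:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits, for illustration only. */
#define DEV_F_HIGHDMA   (1u << 0)  /* can DMA from high memory */
#define DEV_F_ZEROCOPY  (1u << 1)  /* opt-in: completes in a timely fashion,
                                      keeps no page references, and does no
                                      filtering on the payload */

/* Hypothetical stand-in for the lower device under macvtap. */
struct lower_dev {
    const char *name;
    uint32_t    features;
};

/* Zero-copy is used only when the device has explicitly opted in.
 * HIGHDMA alone says nothing about completion latency or filtering. */
static bool can_use_zerocopy(const struct lower_dev *dev)
{
    return dev != NULL && (dev->features & DEV_F_ZEROCOPY) != 0;
}
```

With this shape, a device that only advertises HIGHDMA falls back to the
copying path.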