Re: Combining bridging, 802.1q, and tap

From: Garry Dolley
Date: Wed Apr 15 2009 - 21:27:36 EST


On Wed, Apr 15, 2009 at 07:53:04PM -0500, Chris Adams wrote:
> Once upon a time, Garry Dolley <gdolley@xxxxxxxxxxxxxxx> said:
> > So you have something like:
> >
> >  ------                                  --------
> > |      | tap0 ----> br0 ----> eth0      |        |
> > |  VM  | tap1 ----> br1 ----> eth1      |  Host  |
> > |      | tap2 ----> br2 ----> eth2      |        |
> >  ------                                  --------
> >
> > Correct?
>
> Not exactly. More like:
>
>  --------
> |        | eth0 --> br0
> |  Host  | eth1 --> br1
> |        | eth2 --> br2 (VLANed with br2.20 and br2.30)
>  --------
>
>  --------
> |        | eth0 --> host tap0 --> br0
> |  KVM   | eth1 --> host tap1 --> br1
> |  QEMU  | eth2 --> host tap2 --> br2
> |        | (VLANed in the VM with eth2.20 and eth2.30)
>  --------
>
> In the host, I see:
>
> # brctl show
> bridge name     bridge id               STP enabled     interfaces
> br0             8000.0002b3c1c9aa       no              eth0
>                                                         tap0
> br1             8000.0030bdb23c63       no              eth1
>                                                         tap1
> br2             8000.0004614aee26       no              eth2
>                                                         tap2
> # cat /proc/net/vlan/config
> VLAN Dev name | VLAN ID
> Name-Type: VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD
> br2.20 | 20 | br2
> br2.30 | 30 | br2
>
>
> In the VM, I see (no bridging here):
> # cat /proc/net/vlan/config
> VLAN Dev name | VLAN ID
> Name-Type: VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD
> eth2.20 | 20 | eth2
> eth2.30 | 30 | eth2

Roger that.

This setup looks good so far.

> > First of all, show us the tcpdump command you're running.
>
> I'm running "tcpdump -s0 -e -n -i eth2". If I run it in the host and
> ping from the host to something on the LAN, I see:
>
> 19:00:16.629191 00:04:61:4a:ee:26 > Broadcast, ethertype 802.1Q (0x8100), length 46: vlan 20, p 0, ethertype ARP, arp who-has 172.24.54.14 tell 172.24.54.206
> 19:00:16.629420 00:30:48:22:9c:d1 > 00:04:61:4a:ee:26, ethertype 802.1Q (0x8100), length 64: vlan 20, p 0, ethertype ARP, arp reply 172.24.54.14 is-at 00:30:48:22:9c:d1
> 19:00:16.629477 00:04:61:4a:ee:26 > 00:30:48:22:9c:d1, ethertype 802.1Q (0x8100), length 102: vlan 20, p 0, ethertype IPv4, 172.24.54.206 > 172.24.54.14: ICMP echo request, id 49703, seq 1, length 64
> 19:00:16.630770 00:30:48:22:9c:d1 > 00:04:61:4a:ee:26, ethertype 802.1Q (0x8100), length 102: vlan 20, p 0, ethertype IPv4, 172.24.54.14 > 172.24.54.206: ICMP echo reply, id 49703, seq 1, length 64

Yup, as expected.

> If I run tcpdump in the VM and ping from the VM, I see:
>
> 19:02:04.443160 00:04:61:4a:ee:27 > Broadcast, ethertype ARP (0x0806), length 42: arp who-has 172.24.54.14 tell 172.24.54.207
>
>
> I swear I saw tagged packets within the VM earlier. :-(

Are you running tcpdump in the VM on eth2?

If so, you won't see tagged packets on eth2. Each VLAN's packets
show up untagged on its own sub-interface (eth2.20 or eth2.30). On
eth2 itself you'll see traffic from both VLANs, but it will also
appear *untagged*. You might expect to see tagged packets there (as
I did), but from what I can tell the 8021q module wasn't designed to
behave that way.
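
A quick way to compare interfaces is to count tagged frames in the
"tcpdump -e" output. This is only a sketch: count_tagged is a
hypothetical helper name, and it assumes tcpdump prints the link-level
header in the same format as the captures quoted in this thread.

```shell
#!/bin/sh
# Count tagged vs. untagged frames in `tcpdump -e -n` output read from stdin.
# A frame counts as tagged when its link-level header shows ethertype 802.1Q.
count_tagged() {
    awk '
        /ethertype 802\.1Q \(0x8100\)/ { tagged++; next }
        /ethertype /                   { untagged++ }
        END { printf "tagged=%d untagged=%d\n", tagged, untagged }
    '
}

# Example (as root; interface names are the ones from this thread):
#   tcpdump -s0 -e -n -c 20 -i eth2    | count_tagged
#   tcpdump -s0 -e -n -c 20 -i eth2.20 | count_tagged
```

If eth2 reports tagged=0 while traffic is clearly arriving on eth2.20,
you are seeing the behaviour described above.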

In my experience testing on Ubuntu 8.10 with the stock kernel, once
any VLAN is configured on a physical interface, that physical
interface can no longer be used to see tagged packets.

I got around this by just enslaving the physical int to a bridge,
and doing VLANs on the bridge. I could see all tagged traffic on
the bridge since it just sees the physical int traffic, and the
physical int had no VLANs on it (instead they are on the bridge).

So eth2.20 == bad, br2.20 == good.
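
For reference, the bridge-side layout can be sketched roughly like
this. It assumes bridge-utils (brctl), vconfig and iproute2 are
installed; the interface names and VLAN IDs are the ones from this
thread, and setup_vlan_bridge is just an illustrative helper, not
anything from a package. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: put the 802.1Q sub-interfaces on the bridge, not on the physical NIC.
# RUN defaults to "echo", so the commands are printed rather than executed.
# Set RUN to the empty string (RUN= ./script) and run as root to apply them.
RUN=${RUN-echo}

# usage: setup_vlan_bridge PHYS_IF BRIDGE VLAN_ID...
setup_vlan_bridge() {
    phys=$1; br=$2; shift 2
    $RUN brctl addbr "$br"            # create the bridge
    $RUN brctl addif "$br" "$phys"    # enslave the physical interface
    $RUN ip link set "$phys" up
    $RUN ip link set "$br" up
    for vid in "$@"; do
        $RUN vconfig add "$br" "$vid" # VLAN lives on the bridge (e.g. br2.20)
        $RUN ip link set "$br.$vid" up
    done
}

setup_vlan_bridge eth2 br2 20 30
```

The point is only the ordering: the physical interface carries no
VLAN sub-interfaces of its own, so the bridge still sees the tags.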

Your host looks configured this way, so that's good. In fact I
believe it is the only way your VM can see tagged traffic.

> Okay, if I watch eth2 and eth2.20 with the same tcpdump command as
> above, I see incoming packets correctly. On eth2, I see the tag, and
> then they show up on eth2.20 without the tag. It appears to only be a
> problem with outbound packets not getting tagged (I see the same
> untagged packets in the host with a tcpdump on tap2).

Are you sure you see the tag on eth2? I woulda killed to see the
tag on the physical int when doing my setup a few weeks ago.

I wonder if VLAN hardware acceleration has anything to do with it.
In my setup, I had problems with seeing tags on the *host*, which
has access to the physical NIC w/ HW VLAN acceleration. In your
setup, your NIC is virtual in the VM, and maybe different properties
apply (wonder how the virtual NIC looks to the VM in terms of HW
VLAN acceleration).
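
If you want to check the offload angle, something like the following
may help. It is a sketch that assumes a reasonably recent ethtool,
one that lists rx-vlan-offload and tx-vlan-offload in "ethtool -k";
older versions may not report them at all. vlan_offload_state is a
hypothetical helper name.

```shell
#!/bin/sh
# Print only the VLAN offload lines from `ethtool -k IFACE` output on stdin.
# Assumes the feature names rx-vlan-offload / tx-vlan-offload are reported.
vlan_offload_state() {
    awk '/^[ \t]*[rt]x-vlan-offload:/ { sub(/^[ \t]+/, ""); print }'
}

# Example (inside the VM and on the host, to compare):
#   ethtool -k eth2 | vlan_offload_state
# Where the driver allows it, offload can be switched off for a test with:
#   ethtool -K eth2 rxvlan off txvlan off
```

No output means ethtool did not report those features for that
interface, which by itself says nothing either way.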

> Any ideas why the VM wouldn't be tagging properly? It appears to be
> configured correctly. The VM system is RHEL5.3, with the latest kernel
> (kernel-2.6.18-128.1.6.el5.x86_64). I don't have a non-virtual RHEL5
> system I can put my hands on at the moment to test there to see if this
> is a general bug.

Can you show a tcpdump on the host when pinging from inside the VM?
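
Something along these lines would do, walking the path hop by hop on
the host to see where the tag goes missing. It reuses your tcpdump
options and adds -c to stop after a few packets; capture_path is a
made-up helper name, and by default the script only prints the
commands it would run.

```shell
#!/bin/sh
# Sketch: capture on each hop of the VM's outbound path, one after another.
# RUN defaults to "echo" (print only). Set RUN to the empty string and run
# as root on the host, while the ping is running in the VM, to capture.
RUN=${RUN-echo}

# usage: capture_path PACKET_COUNT IFACE...
capture_path() {
    count=$1; shift
    for ifc in "$@"; do
        # -e prints the link-level header, which is where the 802.1Q tag shows
        $RUN tcpdump -s0 -e -n -c "$count" -i "$ifc"
    done
}

capture_path 10 tap2 br2 eth2
```

No capture filter is used on purpose, so tagged and untagged frames
are both shown.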

Ultimately, your setup is possible. I have the same setup working
in production *but* my VM isn't Linux, it's OpenBSD. I'm using the
vlan pseudo-interface in OpenBSD and I'm passing VLAN tags all over
the place. Works great. I imagine the same is possible w/ a Linux
VM. My host is Ubuntu 8.10.

--
Garry Dolley
ARP Networks, Inc. | http://www.arpnetworks.com | (818) 206-0181
Data center, VPS, and IP Transit solutions
Member Los Angeles County REACT, Unit 336 | WQGK336
Blog http://scie.nti.st