Re: [PATCH 1/4] veth: move loopback logic to common location

From: Patrick McHardy
Date: Tue Nov 24 2009 - 11:56:29 EST


Eric W. Biederman wrote:
> Patrick McHardy <kaber@xxxxxxxxx> writes:
>
>>>>> I did all my testing with macvlan interfaces in separate namespaces
>>>>> communicating with each other, so I'd assume that we should always
>>>>> clear skb->mark and skb->dst in this function.
>>>> Good point, in that case we probably should clear it as well. But
>>>> in the non-namespace case the TC classification currently works and
>>>> this is consistent with any other virtual device driver, so it
>>>> should continue to work.
>>> Do you think we should be able to use TC to direct traffic between
>>> macvlans on the same underlying device in bridge mode? It does sound
>>> useful, but I'm not sure how to implement that or if you'd expect
>>> it to work with the current code. If we support that, it should probably
>>> also work with namespaces, by consuming the mark in the macvlan
>>> and veth drivers.
>> I don't think its necessary, we bypass outgoing queuing anyways.
>> But if you'd want to add it, just keeping the skb->mark clearing
>> in veth should work from what I can tell.
>
> veth doesn't have an outgoing queue. The reason we clear skb->mark
> in veth is because when reentering the networking stack the packet
> needs to be reclassified. At the point of loopback we are talking
> a packet that has at least logically gone out of the machine on a
> wire and come back into the machine on another physical interface.
>
> So it seems to me we should have consistent handling for macvlans,
> veth, for the cases where we are looping packets back around. In
> practice I expect all of those cases are going to be cross namespace
> as otherwise we would have intercepted the packet before going
> out a physical interface.

Agreed on the looping case, that's what we're doing now.

In the layered case (macvlan -> eth0) its common behaviour to
keep the mark however. But in case of different namespaces,
I think macvlan should also clear the mark on the dev_queue_xmit()
path since this is just a shortcut to looping the packets
through veth. In fact probably both of them should also clear
skb->priority so other namespaces don't accidentally misclassify
packets.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/