Re: [PATCH] net: dev_forward_skb(): Scrub packet's per-netns info only when crossing netns

From: Daniel Borkmann
Date: Thu Mar 15 2018 - 10:53:18 EST


On 03/15/2018 03:35 PM, Roman Mashak wrote:
> Liran Alon <liran.alon@xxxxxxxxxx> writes:
> [...]
>>> Overall I think it might be nice to not need scrubbing skb in such cases,
>>> although my concern would be that this has potential to break existing
>>> setups when they would expect mark being zero on other veth peer in any
>>> case since it's the behavior for a long time already. The safer option
>>> would be to have some sort of explicit opt-in e.g. on link creation to let
>>> the skb->mark pass through unscrubbed. This would definitely be a useful
>>> option e.g. when mark is set in the netns facing veth via clsact/egress
>>> on xmit and when the container is unprivileged anyway.
>>>
>>> Thanks,
>>> Daniel
>>
>> I see your point in regard to backwards compatibility.
>> However, not scrubbing the skb when it crosses netns via some kernel functions, while
>> others do scrub it, is basically a bug which could easily break with a little more refactoring.
>> Therefore, it seems a bit weird to me that from now on we would force
>> every user on link creation to consider that there once was a bug leading
>> to this weird behavior on specific netdevs.

Why bug specifically? It could well be that for some unpriv containers
it would be fine to let the mark through, e.g. in cases where the orchestrator
sets up clsact/egress on veth/ipvlan/etc in the container to set the mark and
where the app cannot mess with this, while for others you need to act from the
host facing veth; thus, an explicit opt-in per dev could provide such more
fine grained control.
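
To make the opt-in idea a bit more concrete, here is a rough userspace
sketch in C. It is only a model under stated assumptions: struct model_dev,
struct model_skb, the keep_mark flag and model_forward() are all hypothetical
names invented for illustration and are not kernel API. The only point it
shows is that the default keeps today's behavior (mark is zeroed on the peer)
and that a device has to opt in explicitly to let the mark through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a netdev; not a kernel structure. */
struct model_dev {
	bool keep_mark;	/* assumed per-device opt-in, e.g. set at link creation */
};

/* Hypothetical model of a packet; only the mark matters here. */
struct model_skb {
	uint32_t mark;
};

/*
 * Model of forwarding a packet to the peer device.
 * Default (keep_mark == false): the mark is scrubbed, as existing
 * setups expect. Opt-in (keep_mark == true): the mark passes through.
 */
static void model_forward(struct model_skb *skb, const struct model_dev *dev)
{
	assert(skb && dev);

	if (!dev->keep_mark)
		skb->mark = 0;
}
```

With such a flag, an orchestrator that controls clsact/egress inside an
unprivileged container could enable it on that one device, while every
other device would keep scrubbing the mark as it does today.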

> One valid use case could be preserving a source namespace nsid in
> skb->mark when a packet crosses netns.

Right, was thinking about something similar.