Re: [PATCH v3 1/2] net: core: Notify on changes to dev->promiscuity.

From: Ido Schimmel
Date: Thu Aug 29 2019 - 15:36:21 EST


On Thu, Aug 29, 2019 at 08:29:57PM +0200, Andrew Lunn wrote:
> > Hi Andrew,
> >
> > What happens when you run tcpdump on a routed interface without putting
> > it in promiscuous mode ('-p')? If it is a pure software switch, then you
> > see all unicast packets addressed to your interface's MAC address. What
> > happens when the same is done on a hardware switch? With the proposed
> > solution you will not get the same result.
> >
> > On a software switch, when you run tcpdump without '-p', do you incur
> > major packet loss? No. Will this happen when you punt several Tbps to
> > your CPU on the hardware switch? Yes.
>
> Hi Ido
>
> Please think about the general case, not your hardware. A DSA switch
> generally has 1G ports. And the connection to the host is generally
> 1G, maybe 2.5G. So if I put one interface into promisc mode, I will
> probably receive the majority of the traffic on that port, so long as
> there is not too much traffic from other ports towards the CPU.
>
> I also don't expect any major packet loss in the switch. It is still
> hardware switching, but also sending a copy to the CPU. That copy will
> have the offload_fwd_mark bit set, so the bridge will discard the
> frame. The switch egress queue towards the CPU might overflow, but
> that means tcpdump does not get to see all the frames, and some
> traffic which is actually heading to the CPU is lost. But that can
> happen anyway.

The potential packet loss was only one example of why using promiscuous
mode as an indication to punt all traffic to the CPU is wrong. I also
mentioned that you will not capture any traffic (besides
control/exception packets) when '-p' is specified.
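
To make the '-p' point concrete, here is a rough sketch of how one could
check whether a netdev currently has IFF_PROMISC set. With '-p' the capture
tool never asks for promiscuous mode, so this bit stays clear and a driver
keyed on it would have nothing to react to. It assumes the hex value exposed
in /sys/class/net/<dev>/flags; the helper names are made up for illustration.

```python
IFF_PROMISC = 0x100  # value of IFF_PROMISC in <linux/if.h>


def is_promisc(flags_text):
    """Return True if a hex flags string (e.g. '0x1103') has IFF_PROMISC set."""
    return bool(int(flags_text.strip(), 16) & IFF_PROMISC)


def read_promisc(dev):
    """Read the flags of a netdev from sysfs (assumed path) and test the bit."""
    with open(f"/sys/class/net/{dev}/flags") as f:
        return is_promisc(f.read())
```

For example, is_promisc("0x1103") is true and is_promisc("0x1003") is false.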

> We should also think about the different classes of users. Somebody
> using a TOR switch with a NOS is very different to a user of a SOHO
> switch in their WiFi access point. The first probably knows tc very
> well, the second has probably never heard of it, and just wants
> tcpdump to work like on their desktop.

I fully agree that we should make it easy for users to capture offloaded
traffic, which is why I suggested patching libpcap. Add a flag to
capable netdevs that tells libpcap that in order to capture all the
traffic from this interface it needs to add a tc filter with a trap
action. That way zero familiarity with tc is required from users.
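
As a rough sketch of what the user space side could look like: the function
below only builds the tc command lines a capture tool might run before it
starts capturing, and the ones to undo them afterwards. The needs_trap
argument stands in for the proposed netdev flag, which does not exist today,
and the function names, the choice of a key-less flower filter and the
preference value are my assumptions, not anything libpcap implements.

```python
def capture_setup_commands(dev, needs_trap, pref=1):
    """Build tc commands (argv lists) that trap ingress traffic to the CPU.

    needs_trap is a stand-in for the hypothetical "offloaded netdev" flag.
    """
    if not needs_trap:
        # Plain software netdev: a normal capture already sees the traffic.
        return []
    return [
        # clsact provides the ingress hook the filter attaches to.
        ["tc", "qdisc", "add", "dev", dev, "clsact"],
        # skip_sw: install the rule in hardware only; trap: send to the CPU.
        ["tc", "filter", "add", "dev", dev, "ingress", "pref", str(pref),
         "flower", "skip_sw", "action", "trap"],
    ]


def capture_teardown_commands(dev, needs_trap, pref=1):
    """Build the tc command that removes the trap filter again."""
    if not needs_trap:
        return []
    return [["tc", "filter", "del", "dev", dev, "ingress", "pref", str(pref)]]
```

Whether a given driver accepts a catch-all trap rule like this is up to its
tc offload support, so a real implementation would have to handle failure
and fall back to a regular capture.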

I really believe that instead of interpreting IFF_PROMISC in exotic ways
and pushing all this logic into the kernel, we should instead teach user
space utilities to capture offloaded traffic.