Re: Mcast packet loss 2.6.8.1 kernel

From: Ahmed Kadwa
Date: Thu Nov 15 2007 - 07:14:30 EST


Thanks David,

The problem is that this is an embedded system (arch=mips) with an Au1000
processor running at about 400 MHz. Running the test on my PC with a
different kernel would still leave too many unknowns to track this down.
I will give it a shot, though.

I have a hunch the problem has to do with scheduling/queueing or the IP
stack.

The test program is relatively large. I can post the important bits,
though. The problem is that many pieces of the system have to be running
together for the loss to show up. We have:

Processes:                 IPC type:
-----------------------------------------------------
application                TcpClient

                           TcpServer
Custom protocol stack
                           UDP

interface layer
                           MCAST
----------------------------------------------------- eth device

Thus a single packet coming in on MCAST at the interface layer generates
a lot of traffic, as it has to bubble up the chain.

My interface layer misses packets when tcpdump is NOT running ??
When I run tcpdump with promiscuous mode off (the -p option), tcpdump
reports missing packets, as does my interface layer (on the order of 10
lost per 1000 packets).
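
A minimal sketch of how a loss figure like "10 in 1000" could be
measured at the receiver. It assumes each test packet starts with a
32-bit big-endian sequence number that the sender increments by one;
that packet layout and the names count_lost and parse_seq are
hypothetical, not taken from the actual test program.

```python
import struct


def parse_seq(payload):
    """Read a big-endian 32-bit sequence number from the first 4 bytes.

    The packet layout is an assumption made for this sketch.
    """
    return struct.unpack("!I", payload[:4])[0]


def count_lost(seqs):
    """Return (received, lost) for an iterable of sequence numbers.

    Assumes the sender increments the number by 1 per packet.
    Duplicates and late (reordered) packets are counted as received
    but do not reduce the lost count.
    """
    received = 0
    lost = 0
    expected = None
    for seq in seqs:
        received += 1
        if expected is None:
            expected = seq          # first packet seen sets the baseline
        if seq >= expected:
            lost += seq - expected  # packets expected..seq-1 never arrived
            expected = seq + 1
    return received, lost
```

Feeding it the sequence numbers seen by the interface layer, and
separately those seen by tcpdump, would show whether both lose the same
packets.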

Steps I will take:
1. Check whether using unicast UDP instead of multicast has any effect.

2. Quantify the work required to upgrade/patch the kernel.
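
For step 1, a rough sketch of a receiver that can be opened either as a
plain unicast UDP socket or as a multicast listener, so the same test
can be run both ways. The function names, the port, and the group
address in the usage note are placeholders, not values from the real
system.

```python
import socket
import struct


def make_mreq(group, ifaddr="0.0.0.0"):
    """Build the 8-byte ip_mreq structure: group address + local interface."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(ifaddr))


def open_receiver(port, group=None, ifaddr="0.0.0.0"):
    """Open a UDP receive socket bound to `port`.

    If `group` is given, join that multicast group on `ifaddr`;
    otherwise the socket only receives unicast datagrams.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    if group is not None:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        make_mreq(group, ifaddr))
    return sock
```

For example, open_receiver(5000) would give the unicast case and
open_receiver(5000, group="239.1.2.3") the multicast case; everything
above the socket stays the same.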


Further info:
When I don't run the layers above, but just a multicast listener, I do
get all the packets. However, I can also cause packet loss by
artificially loading the processor (running a load script that just adds
13 and updates a counter variable).
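
The load script is not shown here; the following is only a sketch of
what such a busy loop might look like, with the name busy_load and the
duration parameter invented for illustration. Running something like
this alongside the bare multicast listener is a way to reproduce the
loss without the upper layers.

```python
import time


def busy_load(duration_s, step=13):
    """Spin the CPU for about `duration_s` seconds.

    Each iteration adds `step` to a running total and bumps a counter,
    mirroring the "adds 13 and updates a counter" description.
    Returns (total, counter).
    """
    total = 0
    counter = 0
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        total += step
        counter += 1
    return total, counter
```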

Help is much appreciated, Thanks

Regards
A.K

On Thu, 2007-11-15 at 01:02 -0800, David Stevens wrote:
> Well, that output doesn't show any new (intentional) drops, but the kernel
> age means there could be an already-fixed bug, too.
>
> A couple things I'd suggest:
>
> 1) if you can run your program with a unicast address
> and see if you have similar drops
> 2) even if you can't update your entire kernel, if you can
> run the test program on a current kernel (even on a
> different system), you may get a hint whether it's a problem
> that has been fixed (figuring out which patch(es) you need is
> another matter!)
> 3) if the test program (sender and receiver) is reasonably small,
> you can post that here and I can try it on a current kernel
> to see if I can reproduce the results on different hardware, as
> well as look for any problems in the application.
>
> I don't think promiscuous mode directly relates, but tcpdump
> has an option to run with or without promiscuous mode. So you should
> be able to look at the packets both ways. But the IP receive count
> won't go up if there's a problem with device receives (like the
> multicast filter not matching), and you said when you had drops the
> packet count was correct, I believe.
>
> +-DLS
>