On Mon, Jan 18, 2010 at 03:56:45PM -0500, Michael Breuer wrote:
> On 1/18/2010 3:46 PM, Jarek Poplawski wrote:
>> On Mon, Jan 18, 2010 at 11:29:31AM -0500, Michael Breuer wrote:
>>> Ok - up on the two patches, no DMAR. Some early observations:
>>> 1. There's an early on MMAP oops (see below). This happens once, at
>>> the completion of the transition to runlevel 5 (I've seen it
>>> entering runlevel 3 as well). This does not recur when runlevels are
>>> subsequently changed. I do not see this when running with DMAR
>> OK, you mentioned this oops (actually a warning only) happened during
>> previous tests too.
> Yes - dk if it's significant or not. Only obvious difference between
> DMAR and not.

OK, let's try (as long as possible) if it can break so hard as with

>>> 2. The dropped tx packet (DHCP) is a bit harder to recreate, but it
>> Btw, I guess you improved the test because you didn't mention it here,
>> even after my explicit question?:
> I had been focusing on the hangs - dhcp causing the initial crash
> from December. After things stabilized with the af patch & skb may
> pull I started noticing the dropped tx packets. I reported the TX
> loss on the 16th of January after confirming the issue.

OK, but we need to establish some status quo after these patches
before any new things (including DMAR), so I'd suggest trying this
config really longer and harder.

>>> Interestingly, I initially saw no dropped packets
>>> with ping - but after I went the DHCP route and eventually
>>> reconnected, I could then cause dropped tx packets with ping. To
>>> a) start throughput
>>> b) ping device - no packet loss - this was true for the entire test run.
>>> c) start throughput again
>>> d) ping - no loss.
>>> e) drop wifi on the device & restart - first attempt worked. Repeat
>>> attempt yielded the dropped DHCPOFFER packets. After about 6 tries,
>>> the device reconnected to wifi.
>>> f) ping again (after the reconnection) - packet loss rate about 80%.
>>> g) simultaneously ping the wifi router - no loss.
>>> h) After a while, packets are no longer dropped during ping. If I
>>> manage to cause the dhcp drop again, and then ping after the device
>>> finally reconnects, packet loss is significant for a while (maybe 30
>>> sec to a minute). Then things return to normal. Note that the packet
>>> loss continues even if the reported throughput drops to nil.
>>> i) I can't cause the initial packet loss at RX rates below about
>>> 30,000KBPS (as reported by nethogs). At rates over 40 I can
>>> reproduce this on this set of patches & config about 60% of the
>> I forgot to mention, but did you try to check if these lost ping
>> packets are "being dropped somewhere after wireshark sees them and
>> before hitting the wire" like DHCPOFFER? Aren't there any sky2
>> warnings/resets while this happens?
> Yes. There are no errors, and no statistics anywhere that I know to
> look reflect the loss. Nothing in netstat; ethtool -S; etc. The only
> loss reported is RX. The recent TX warnings/resets happened while
> the machine was up for several days and while unattended and under
> high RX load.

Please check "tc -s qdisc" each time as well.
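
One way the "tc -s qdisc" check mentioned in this thread could be made
repeatable between test runs is to snapshot the counters before and after
each attempt and compare the "dropped" values. The following is only a
minimal sketch, assuming Python 3 on the test machine and the usual
"Sent N bytes N pkt (dropped N, overlimits N requeues N)" statistics line
printed by tc. The function names (parse_tc_stats, dropped_delta, snapshot)
are hypothetical and not from the thread.

```python
import re
import subprocess

# Header line of each entry, e.g. "qdisc pfifo_fast 0: dev eth0 root ..."
QDISC_RE = re.compile(r"^qdisc (\S+) (\S+) dev (\S+)")
# Statistics line that follows the header in "tc -s qdisc" output.
SENT_RE = re.compile(
    r"Sent (\d+) bytes (\d+) pkt "
    r"\(dropped (\d+), overlimits (\d+) requeues (\d+)\)"
)


def parse_tc_stats(text):
    """Return one dict per qdisc entry with its device and TX counters."""
    entries = []
    current = None
    for line in text.splitlines():
        header = QDISC_RE.match(line)
        if header:
            current = {
                "kind": header.group(1),
                "handle": header.group(2),
                "dev": header.group(3),
            }
            entries.append(current)
            continue
        stats = SENT_RE.search(line)
        if stats and current is not None:
            names = ("bytes", "pkt", "dropped", "overlimits", "requeues")
            for name, value in zip(names, stats.groups()):
                current[name] = int(value)
    return entries


def dropped_delta(before, after):
    """Map (dev, handle) to the increase in 'dropped' between snapshots."""
    base = {(e["dev"], e["handle"]): e.get("dropped", 0) for e in before}
    return {
        (e["dev"], e["handle"]): e.get("dropped", 0)
        - base.get((e["dev"], e["handle"]), 0)
        for e in after
    }


def snapshot():
    """Run "tc -s qdisc show" and parse its output (needs iproute2)."""
    result = subprocess.run(
        ["tc", "-s", "qdisc", "show"],
        capture_output=True, text=True, check=True,
    )
    return parse_tc_stats(result.stdout)
```

Taking snapshot() before step a) and again after the ping in step f), then
passing both to dropped_delta(), would show whether the qdisc layer counted
any of the lost packets. A delta of zero alongside visible ping loss would
suggest the packets are lost below the qdisc, but that reading is an
assumption, not something established in this thread.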