regression(?): starting with 2.6.21 sending packets became broken.

From: Peter Volkov
Date: Sat Oct 13 2007 - 14:18:21 EST


Hello, all on the list.

Please CC me on replies, as I'm not subscribed. If this is the wrong
list, please tell me which one is correct.

Starting with kernel 2.6.21 (or maybe 2.6.20, which I have not tried),
most TCP-based services freeze at some point during operation. I first
noticed this with ssh, but then found that at least one other service
behaves similarly. The problem sits somewhere in the kernel: I've
compiled 2.6.19, 2.6.21, and 2.6.22 with similar .config options (not
identical, of course, as some options do not exist in some kernels, but
the enabled options appear to be the same), and I have this problem only
with 2.6.21 and 2.6.22. I've tried to debug the problem a bit, but not
much, as this is a production box working as a Linux-based
firewall/router.

First I took a tcpdump capture. An ssh connection to the router is not
always possible, as it often hangs before I get in, but after several
attempts a connection was established. On the client computer I started
tcpdump and worked a bit until the hang. The tcpdump output showed that
when I press keys, the packets are sent to the server and proper ACKs
are received. Later I found that all commands I type blindly are
executed on the router, but the replies I receive carry no data (pure
ACKs only). That's why nothing appears on the screen and the session
looks hung.
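
For anyone repeating this check, the split between data-carrying
segments and empty ones can be counted from a saved text capture. The
following is only a minimal Python sketch, not what was actually run
here: it assumes the line format printed by recent tcpdump releases
(each TCP line ends in "length N"), and the host names in SAMPLE are
made up.

```python
import re
from collections import Counter

# Assumption: recent tcpdump text output, e.g.
#   ... IP a.port > b.port: Flags [P.], seq 1:49, ack 1, win 229, length 48
# Older tcpdump releases print the payload size differently.
LINE_RE = re.compile(
    r"IP6? (?P<src>\S+) > (?P<dst>\S+): Flags \[(?P<flags>[^\]]*)\]"
    r".*\blength (?P<len>\d+)"
)

def classify(lines):
    """Count segments per (src, dst, kind); kind is 'data' or 'no_payload'."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # not a TCP line in the assumed format
        kind = "data" if int(m.group("len")) > 0 else "no_payload"
        counts[(m.group("src"), m.group("dst"), kind)] += 1
    return counts

# Hypothetical two-line capture: one keystroke out, one empty ACK back.
SAMPLE = """\
14:18:21.000001 IP client.51234 > router.22: Flags [P.], seq 1:49, ack 1, win 229, length 48
14:18:21.000100 IP router.22 > client.51234: Flags [.], ack 49, win 227, length 0
"""

if __name__ == "__main__":
    for key, n in sorted(classify(SAMPLE.splitlines()).items()):
        print(key, n)
```

If the router-to-client direction shows only 'no_payload' entries while
the client-to-router direction shows 'data', that matches the symptom
described above.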

Next I logged in to the router and started an ssh connection from the
router to another server. It hung too. I attached strace and found that
ssh receives the key presses (read() calls in the output) and writes
them on to the kernel (write() calls), but tcpdump on the router shows
no packets. So the data enters the kernel and gets lost somewhere
inside.
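
One possible follow-up, which was not done for this report, is to look
at the per-socket send queue in /proc/net/tcp while a session is hung:
a growing tx_queue would suggest the written data is still sitting in
the socket. The Python sketch below is only an illustration; it assumes
the usual IPv4 /proc/net/tcp column layout and a little-endian host
(true for this Pentium 4), and the sample line is invented.

```python
import socket
import struct

def decode_endpoint(field):
    """Turn 'HEXADDR:HEXPORT' into 'a.b.c.d:port' (little-endian host assumed)."""
    addr_hex, port_hex = field.split(":")
    addr = socket.inet_ntoa(struct.pack("<I", int(addr_hex, 16)))
    return f"{addr}:{int(port_hex, 16)}"

def pending_send_queues(proc_net_tcp_text):
    """Return (local, remote, tx_queue_bytes) for sockets with a non-empty send queue."""
    result = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 5:
            continue
        # Column 5 is "tx_queue:rx_queue", both in hexadecimal.
        tx_hex, _rx_hex = fields[4].split(":")
        tx_queue = int(tx_hex, 16)
        if tx_queue > 0:
            result.append(
                (decode_endpoint(fields[1]), decode_endpoint(fields[2]), tx_queue)
            )
    return result

# Invented sample: one listening socket, one connection with 0x30 bytes queued.
SAMPLE_PROC = (
    "  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode\n"
    "   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1111 1\n"
    "   1: 0100007F:0016 0200000A:C822 01 00000030:00000000 01:00000014 00000000     0        0 2222 2\n"
)

if __name__ == "__main__":
    # On a live system, read the real file instead of SAMPLE_PROC:
    #   text = open("/proc/net/tcp").read()
    for local, remote, queued in pending_send_queues(SAMPLE_PROC):
        print(local, "->", remote, "queued bytes:", queued)
```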

Now some information about my system. It's a Pentium 4 system with
hyper-threading enabled; cpuinfo and lspci output are attached. The
kernel was built with "gcc version 4.1.2 (Gentoo 4.1.2 p1.0.2)" and
binutils version 2.17. The .config files for all the kernels I've
mentioned are available here:

http://theor.ran.gpi.ru/linux-2.6.19-gentoo-r5-config (works)
http://theor.ran.gpi.ru/linux-2.6.21-gentoo-r4-config (does not work)
http://theor.ran.gpi.ru/linux-2.6.22-gentoo-r8-config (does not work)
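
Since the configs are similar but not identical, a mechanical
comparison may help spot options that changed between the working and
non-working kernels. This is a minimal Python sketch under the
assumption that the files use the standard "CONFIG_X=value" and
"# CONFIG_X is not set" lines; the option names in the example are
placeholders, not options taken from the files above.

```python
import re

SET_RE = re.compile(r"^(CONFIG_\w+)=(.*)$")
UNSET_RE = re.compile(r"^# (CONFIG_\w+) is not set$")

def parse_config(text):
    """Map option name -> value; explicitly unset options map to 'n'."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        m = SET_RE.match(line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = UNSET_RE.match(line)
        if m:
            opts[m.group(1)] = "n"
    return opts

def diff_configs(old, new):
    """Return {option: (old_value, new_value)}; None means the option is absent."""
    changed = {}
    for name in sorted(set(old) | set(new)):
        a, b = old.get(name), new.get(name)
        if a != b:
            changed[name] = (a, b)
    return changed

# Placeholder configs standing in for the 2.6.19 and 2.6.21 files.
OLD_TEXT = "CONFIG_EXAMPLE_A=y\n# CONFIG_EXAMPLE_B is not set\nCONFIG_EXAMPLE_C=m\n"
NEW_TEXT = "CONFIG_EXAMPLE_A=y\nCONFIG_EXAMPLE_B=y\nCONFIG_EXAMPLE_NEW=y\n"

if __name__ == "__main__":
    diff = diff_configs(parse_config(OLD_TEXT), parse_config(NEW_TEXT))
    for name, (a, b) in diff.items():
        print(f"{name}: {a} -> {b}")
```

Options that appear only in the newer config (old value None) are the
ones that did not exist in 2.6.19 and may be worth toggling first.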

Besides the standard Gentoo patchsets, all kernels have the IMQ and
IPSET patches.

Does anybody have any idea what's going on with the latest kernels? How
can I debug this further?

--
Peter.
00:00.0 Host bridge: Intel Corporation 82865G/PE/P DRAM Controller/Host-Hub Interface (rev 02)
00:01.0 PCI bridge: Intel Corporation 82865G/PE/P PCI to AGP Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c2)
00:1f.0 ISA bridge: Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801EB (ICH5) SATA Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801EB/ER (ICH5/ICH5R) SMBus Controller (rev 02)
01:00.0 VGA compatible controller: nVidia Corporation NV15 [GeForce2 GTS/Pro] (rev a4)
02:0a.0 PCI bridge: Digital Equipment Corporation DECchip 21152 (rev 03)
02:0b.0 PCI bridge: Digital Equipment Corporation DECchip 21152 (rev 03)
03:04.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 05)
03:05.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 05)
04:04.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 05)
04:05.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 05)

processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Pentium(R) 4 CPU 3.20GHz
stepping : 9
cpu MHz : 3198.784
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc
pni monitor ds_cpl cid xtpr
bogomips : 6401.59

processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Pentium(R) 4 CPU 3.20GHz
stepping : 9
cpu MHz : 3198.784
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc
pni monitor ds_cpl cid xtpr
bogomips : 6397.43

Attachment: signature.asc
Description: This part of the message is digitally signed