IP Masq code BUG? (Was: Connection reset by peer errors)

Dave Cinege (dcinege@psychosis.com)
Fri, 28 May 1999 01:14:15 -0400


I have found a problem I can reproduce that seems to be related to the IP Masq
kernel code.

(Please read my earlier email below so you can understand this)

Procedure:
I commented out all IP masq kernel modules, and ipfwadm config
lines on both my servers. I *rebooted* both servers.

From my host I traceroute www.netscape.com. It dies at server2. IP Masq is
definitely OFF.

From my Solaris machine I telnet into server 2. OK.
I then telnet into server 1. *OK*

I run the following on server 2:
ipfwadm-wrapper -F -f
ipfwadm-wrapper -F -p deny
ipfwadm-wrapper -F -a m -S "192.168.0.0"/24 -D 0.0.0.0/0

From my Solaris machine I telnet into server 2. OK.
I then telnet into server 1. It fails with "SetSockOpt: Invalid argument"
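
As a side check on which forwarded packets fall under the "-a m" rule above,
the address match can be sketched in Python. This only models the source and
destination prefixes given to ipfwadm-wrapper, not what the kernel does with
the packet afterwards. The workstation address 192.168.0.10 is hypothetical;
the message does not give the Solaris machine's address.

```python
import ipaddress

# Source and destination prefixes taken from the "-a m" rule above.
RULE_SRC = ipaddress.ip_network("192.168.0.0/24")
RULE_DST = ipaddress.ip_network("0.0.0.0/0")


def matches_masq_rule(src: str, dst: str) -> bool:
    """Return True if a forwarded packet src -> dst matches both prefixes."""
    return (ipaddress.ip_address(src) in RULE_SRC
            and ipaddress.ip_address(dst) in RULE_DST)


if __name__ == "__main__":
    # 192.168.0.10 is a made-up workstation address (assumption).
    print(matches_masq_rule("192.168.0.10", "208.225.71.1"))
    # A packet sourced from server 2's public address is outside the prefix.
    print(matches_masq_rule("208.225.71.2", "208.225.71.1"))
```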

My guess is:
When IP Masq is enabled and the machine forwards traffic that is NOT to be
masq'ed, the initial packet is truncated. From that point on the data is fine,
unless the IP masq timeout is reached. Then once again the truncation will
repeat itself on the first packet.
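
One way to check the "first connection fails, later ones are fine" pattern
without a telnet or POP3 client is a small probe like the one below. It is
only a sketch: the function names are mine, and it assumes the service sends
a banner on connect (as POP3 and SSH do). It reports what the client sees; it
does not prove where the packet was damaged.

```python
import socket
import time


def probe(host: str, port: int, timeout: float = 5.0) -> str:
    """Open one TCP connection, read the first bytes, classify the result."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            data = s.recv(256)
    except ConnectionResetError:
        return "reset"            # "Connection reset by peer"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
    return "ok" if data else "closed"


def first_vs_second(host: str, port: int, gap: float = 1.0):
    """Probe twice in a row and return both results for comparison."""
    first = probe(host, port)
    time.sleep(gap)
    second = probe(host, port)
    return first, second
```

If the guess is right, calling first_vs_second("208.225.71.1", 110) from a
workstation after a long idle period would be expected to return a failure
followed by "ok"; I have not run it against this setup.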

All the above testing is with 2.2.9, but I've seen this problem since at
least the 2.0.36pre's.

===================================================================
>
> > Ever since 2.0.36pre, I've begun seeing "Connection reset by peer"
> > errors when I ssh in to my servers. It eventually got worse,
> > to the point where I get it when I first check my POP3 server.
> >
> > Can anyone enlighten me?

An important point I forgot to mention (DOH!)....I see this only when bouncing
off an internal router.

Linux Server1
208.225.71.1 eth0...internet
192.168.0.1 eth1

Linux Server2
208.225.71.2 eth0
192.168.0.2 eth1
route add -host 208.225.71.1 gw 192.168.0.1

Workstation(s)
192.168.0.0 network
route add default gw 192.168.0.2

In the above all workstation requests for 208.225.71.1 (server 1 eth0)
will be bounced off of server 2, since server 2 is their default route.
That is where I see the problem. If a workstation requests
208.225.71.2, all is well. If it requests 208.225.71.1, (and gets bounced)
depending on the services and client OS, initially it will error with
"Connection reset by peer". Subsequent 'bounces' will be fine, unless
communication stops for a few minutes (10? 30?), in which case it will again
give an initial "Connection reset by peer".
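
The bounce described above can be traced with a toy longest-prefix lookup.
This is a sketch only: the tables are reduced to the routes listed in the
layout, the /24 on the internal network is taken from the ipfwadm rule, and
the eth0 netmask is not stated anywhere, so the public network is left out.

```python
import ipaddress

# (destination prefix, next hop); None means directly connected.
WORKSTATION = [
    ("192.168.0.0/24", None),
    ("0.0.0.0/0", "192.168.0.2"),        # default route via server 2
]
SERVER2 = [
    ("192.168.0.0/24", None),
    ("208.225.71.1/32", "192.168.0.1"),  # host route via server 1
]


def next_hop(table, dst: str) -> str:
    """Return the next hop for dst using longest-prefix match."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, gw in table:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gw)
    if best is None:
        raise LookupError(f"no route to {dst}")
    # Directly connected destinations are their own next hop.
    return best[1] if best[1] is not None else dst


if __name__ == "__main__":
    print(next_hop(WORKSTATION, "208.225.71.1"))  # hop 1: server 2 (eth1)
    print(next_hop(SERVER2, "208.225.71.1"))      # hop 2: server 1 (eth1)
```

Both hops are on 192.168.0.0, so server 2 receives the packet on eth1 and
forwards it back out the same interface.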

Linux 2.0 and 2.2:
SSH, SCP (2 in a row), and POP3 (qpopper).

Solaris 7 x86:
http (Netscape), telnet (SetSockOpt: invalid argument).

I'm sure there are other services I haven't listed where it is a problem.
Under Linux Netscape is NOT a problem.

Yes, I know ideally it can be set up so the router bounce is not needed.
But is there actually some underlying problem with the Linux code?
(forwarding, firewall, or ipmasq?) From syslog, qpopper's failures
look like this:

May 27 20:20:08 schizo in.qpopper[32335]: @zen-machine.psychosis.com: -ERR Too
few arguments for the auth command.

Could the initial packet's data be getting its head or tail cut off?
===================================================================

-- 
http://www.linuxrouter.org/     Linux Router Project

marshall.us-state.gov - - [26/Apr/1999:09:27:40 -0400] "GET /robots.txt HTTP/1.0" 302 211

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/