2.2.15: TCP connect()+poll() broken on SMP?

From: Peter Hartley (peter@empeg.com)
Date: Tue Jul 18 2000 - 08:47:44 EST


I've got an application that does a blocking connect() to a TCP server in
the normal manner, then immediately calls poll(), on just that one fd, to
test for readability.

On SMP machines, the poll occasionally (1 in 5 times) says the socket *is*
readable (POLLIN+POLLRDNORM), even if I've pointed the program to a port
whose service isn't initally readable (e.g. HTTP). A second call to poll()
immediately afterwards correctly says that the socket isn't readable. And
of course a blocking read() blocks.

This problem never seems to occur on uniprocessors, nor if the service I'm
connecting to is on the local host.

Data points are:
  P2/450 x2, kernel 2.2.12: sometimes fails
  P2/400 x2, kernel 2.2.15: sometimes fails
  P3/800 x2, kernel 2.2.15+reiserfs: sometimes fails
  Pentium/133 x1, 2.2.15: never fails
  P2/600 x1, 2.2.15: never fails
  Celeron/500 x1, 2.2.15: never fails

I haven't tried it with 2.2.16, 2.2.17preX or 2.4.* on any of the dual-CPU
machines as they're all production servers :-( but there didn't look to be
anything relevant in the 2.2.15-2.2.16 patch.

CC replies to me please if poss as I'm not subscribed. Also mail me if you
want a test program, but all it does is go gethostbyname_r+socket+connect+
poll+poll in what appears to be a perfectly standard fashion.

        Peter

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Jul 23 2000 - 21:00:11 EST