Re: [fixed] [patch] Re: [bug] stuck localhost TCP connections,v2.6.26-rc3+

From: Ilpo Järvinen
Date: Tue Jun 03 2008 - 19:22:39 EST


On Tue, 3 Jun 2008, David Miller wrote:

> From: "Ilpo_Järvinen" <ilpo.jarvinen@xxxxxxxxxxx>
> Date: Wed, 4 Jun 2008 01:01:25 +0300 (EEST)
>
> > On Wed, 4 Jun 2008, Ilpo Järvinen wrote:
> >
> > > ...I couldn't immediately find anything obviously wrong with those changes
> > > but the patch below might be worth of a try (without the revert of
> > > course). If it ever spits out that WARN_ON for you, we were playing with
> > > fire too much and it's better to return on the safe side there...
> >
> >
> > > [PATCH] tcp DEFER_ACCEPT: see if header prediction got turned on
> > >
> > > If header prediction is turned on under some circumstances,
> > > DA can deadlock though I have great trouble in figuring out
> >
> > ...Nah, keepalive timer would then eventually kill it then, so no
> > deadlock seems possible through that one.
>
> Keepalive is very long, it might still "seem" like a deadlock for
> someone without much patience :-)

I think we want that clearing there; it's better to be safe than sorry
and not to put any trust in the keepalive thingie, which tears the
connection down rather than resulting in one.

But here's a somewhat more likely explanation... Only compile tested...
It probably needs some comments from people who understand the locking
variants & details (I don't).


--
i.


--
[PATCH] tcp DEFER_ACCEPT: fix racy access to listen_sk

It seems that the replacement of the DA code also moved parts
outside of the appropriate locking. Ingo's problem seems to come
from the fact that two flows could now race in
(inet_csk_)reqsk_queue_add, corrupting the queue. ...This can
leave dangling socks around which won't resolve themselves
without stimuli from outside (e.g., an external RST would help,
I think).

Then some details I'm not too sure of:
I guess we want to put the listen_sk->sk_state check under the
lock as well. I've not evaluated whether ->sk_data_ready also
requires locking, but assumed it does.

I'm by no means familiar with all the locking variants,
requirements, etc.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxx>
---
net/ipv4/tcp_input.c | 23 +++++++++++++----------
1 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index c9454f0..d21d2b9 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4562,6 +4562,7 @@ static int tcp_defer_accept_check(struct sock *sk)
struct tcp_sock *tp = tcp_sk(sk);

if (tp->defer_tcp_accept.request) {
+ struct sock *listen_sk = tp->defer_tcp_accept.listen_sk;
int queued_data = tp->rcv_nxt - tp->copied_seq;
int hasfin = !skb_queue_empty(&sk->sk_receive_queue) ?
tcp_hdr((struct sk_buff *)
@@ -4570,8 +4571,9 @@ static int tcp_defer_accept_check(struct sock *sk)
if (queued_data && hasfin)
queued_data--;

- if (queued_data &&
- tp->defer_tcp_accept.listen_sk->sk_state == TCP_LISTEN) {
+ bh_lock_sock(listen_sk);
+
+ if (queued_data && listen_sk->sk_state == TCP_LISTEN) {
if (sock_flag(sk, SOCK_KEEPOPEN)) {
inet_csk_reset_keepalive_timer(sk,
keepalive_time_when(tp));
@@ -4579,23 +4581,24 @@ static int tcp_defer_accept_check(struct sock *sk)
inet_csk_delete_keepalive_timer(sk);
}

- inet_csk_reqsk_queue_add(
- tp->defer_tcp_accept.listen_sk,
- tp->defer_tcp_accept.request,
- sk);
+ inet_csk_reqsk_queue_add(listen_sk,
+ tp->defer_tcp_accept.request,
+ sk);

tp->defer_tcp_accept.listen_sk->sk_data_ready(
- tp->defer_tcp_accept.listen_sk, 0);
+ listen_sk, 0);

- sock_put(tp->defer_tcp_accept.listen_sk);
+ sock_put(listen_sk);
sock_put(sk);
tp->defer_tcp_accept.listen_sk = NULL;
tp->defer_tcp_accept.request = NULL;
- } else if (hasfin ||
- tp->defer_tcp_accept.listen_sk->sk_state != TCP_LISTEN) {
+ } else if (hasfin || listen_sk->sk_state != TCP_LISTEN) {
+ bh_unlock_sock(listen_sk);
tcp_reset(sk);
return -1;
}
+
+ bh_unlock_sock(listen_sk);
}
return 0;
}
--
1.5.2.2
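
The race described in the changelog above can be sketched in user space.
This is only an illustration under stated assumptions, not kernel code:
struct req, struct accept_queue, queue_add_locked() and run_two_flows()
are hypothetical names, a pthread mutex stands in for
bh_lock_sock(listen_sk), and the "listening" flag stands in for the
sk_state == TCP_LISTEN check. The tail append is a simplified model of
what an accept-queue add does.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NREQ 10000

/* Hypothetical stand-ins; these are NOT the kernel structures. */
struct req {
	struct req *next;
};

struct accept_queue {
	pthread_mutex_t lock;	/* plays the role of bh_lock_sock(listen_sk) */
	struct req *head;
	struct req *tail;
	int listening;		/* plays the role of sk_state == TCP_LISTEN */
};

static void queue_init(struct accept_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	q->head = NULL;
	q->tail = NULL;
	q->listening = 1;
}

/*
 * Plain tail append. If two flows run this at the same time without a
 * lock, both can read the same tail and one request is lost or the
 * list is left inconsistent.
 */
static void queue_add_unlocked(struct accept_queue *q, struct req *r)
{
	r->next = NULL;
	if (q->head == NULL)
		q->head = r;
	else
		q->tail->next = r;
	q->tail = r;
}

/* State check and append under one lock; 0 if queued, -1 if not listening. */
static int queue_add_locked(struct accept_queue *q, struct req *r)
{
	int ret = -1;

	assert(r != NULL);
	pthread_mutex_lock(&q->lock);
	if (q->listening) {
		queue_add_unlocked(q, r);
		ret = 0;
	}
	pthread_mutex_unlock(&q->lock);
	return ret;
}

static size_t queue_len(struct accept_queue *q)
{
	size_t n = 0;
	struct req *r;

	pthread_mutex_lock(&q->lock);
	for (r = q->head; r != NULL; r = r->next)
		n++;
	pthread_mutex_unlock(&q->lock);
	return n;
}

struct flow {
	struct accept_queue *q;
	struct req *reqs;
};

static void *flow_fn(void *arg)
{
	struct flow *f = arg;

	for (size_t i = 0; i < NREQ; i++)
		queue_add_locked(f->q, &f->reqs[i]);
	return NULL;
}

/* Two flows append NREQ requests each; returns the final queue length. */
static size_t run_two_flows(void)
{
	static struct req a[NREQ], b[NREQ];
	struct accept_queue q;
	struct flow fa = { &q, a }, fb = { &q, b };
	pthread_t ta, tb;

	queue_init(&q);
	pthread_create(&ta, NULL, flow_fn, &fa);
	pthread_create(&tb, NULL, flow_fn, &fb);
	pthread_join(ta, NULL);
	pthread_join(tb, NULL);
	return queue_len(&q);
}
```

With the lock held around both the state check and the append,
run_two_flows() should see every request on the queue. Calling
queue_add_unlocked() directly from flow_fn() instead may lose entries,
which is roughly the kind of corruption suspected above.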