[patch] sched, net: fix scheduling latencies in netstat

From: Ingo Molnar
Date: Tue Sep 14 2004 - 05:26:11 EST



The attached patch fixes long scheduling latencies caused by access to
the /proc/net/tcp file. The seqfile functions keep softirqs disabled for
a very long time (I've seen reports of 20+ msecs, if there are enough
sockets in the system). With the attached patch it's below 100 usecs.

The cond_resched_softirq() call relies on the implicit knowledge that
this code executes in process context and runs with softirqs disabled.

Potentially enabling softirqs means that the socket list might change
between buckets - but this is not an issue, since seqfiles have a 4K
iteration granularity anyway and /proc/net/tcp is often (much) larger
than that.

This patch has been in the -VP patchset for weeks.

Ingo

Signed-off-by: Ingo Molnar <mingo@xxxxxxx>

--- linux/net/ipv4/tcp_ipv4.c.orig
+++ linux/net/ipv4/tcp_ipv4.c
@@ -2227,7 +2227,10 @@ static void *established_get_first(struc
struct sock *sk;
struct hlist_node *node;
struct tcp_tw_bucket *tw;
-
+
+ /* We can reschedule _before_ having picked the target: */
+ cond_resched_softirq();
+
read_lock(&tcp_ehash[st->bucket].lock);
sk_for_each(sk, node, &tcp_ehash[st->bucket].chain) {
if (sk->sk_family != st->family) {
@@ -2274,6 +2277,10 @@ get_tw:
}
read_unlock(&tcp_ehash[st->bucket].lock);
st->state = TCP_SEQ_STATE_ESTABLISHED;
+
+ /* We can reschedule between buckets: */
+ cond_resched_softirq();
+
if (++st->bucket < tcp_ehash_size) {
read_lock(&tcp_ehash[st->bucket].lock);
sk = sk_head(&tcp_ehash[st->bucket].chain);