2.2.16 - bad networking/slab/VM interaction - proposed fix(es)

From: Mark Hemment (Mark_Hemment@eur.3com.com)
Date: Thu Jun 15 2000 - 10:24:40 EST


Hi,

  I've a box here which "freezes" under high networking/disk load.

  The freeze is caused by a process looping in tcp_do_sendmsg(). The call to
sock_wmalloc() fails because alloc_skb() fails. tcp_do_sendmsg() then calls
wait_for_tcp_memory(), but as "sk->wmem_alloc < sk->sndbuf" (i.e. there is
still room to grow; the send-buffer max has not been reached) it returns
immediately, and tcp_do_sendmsg() loops round for another call to
sock_wmalloc().
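
The loop described above can be sketched as a small user-space model. This is
not kernel source: the struct, the model_* function names and the max_iters
cap are hypothetical stand-ins introduced only to show why the retry never
sleeps while "wmem_alloc < sndbuf" holds and the allocation keeps failing.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two socket fields named in the post. */
struct sock_model {
    int wmem_alloc;   /* bytes currently charged to the send queue */
    int sndbuf;       /* send-buffer limit */
};

/* Models sock_wmalloc(): returns NULL when alloc_skb() would fail. */
static void *model_sock_wmalloc(bool memory_available)
{
    static char buf[1];
    return memory_available ? buf : NULL;
}

/* Models wait_for_tcp_memory(): returns true only if it would have slept. */
static bool model_wait_for_tcp_memory(const struct sock_model *sk)
{
    if (sk->wmem_alloc < sk->sndbuf)
        return false;   /* room left in the send buffer: return at once */
    return true;        /* buffer full: the real code would sleep here */
}

/*
 * Models the retry loop in tcp_do_sendmsg(). Returns the number of
 * iterations that neither allocated nor slept. max_iters is an assumption
 * of this model so that it terminates; the real loop has no such cap.
 */
static int model_send_loop(const struct sock_model *sk,
                           bool memory_available, int max_iters)
{
    int spins = 0;

    while (spins < max_iters) {
        if (model_sock_wmalloc(memory_available) != NULL)
            break;                      /* allocation succeeded */
        if (model_wait_for_tcp_memory(sk))
            break;                      /* slept, so the CPU was released */
        spins++;                        /* neither: busy retry */
    }
    return spins;
}
```

With an empty send queue and a failing allocator the model spins until the
cap is hit, which corresponds to the "freeze" seen on the real box.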

  All the free pages are single pages, which is why alloc_skb() fails (the
allocation is coming from a slab cache where the page order is greater than 0
and the slab cache has no free objects).
  The number of free pages is greater than "(freepages.low+freepages.low)/2",
and no-one else is allocating, so try_to_free_pages() is not called from
__get_free_pages(), i.e. no attempt is made to make progress on releasing
memory.
  "kswapd" has been woken up (free pages is less than freepages.high, so
__get_free_pages() eventually does this, or its schedule_timeout() expired),
but as a process is "spinning" on the CPU it doesn't get a chance to be
scheduled.
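
The memory state described above (plenty of free pages, none of them
contiguous) can be illustrated with a toy model of per-order free lists. This
is only a sketch: the struct, the model_* names, MODEL_MAX_ORDER and the
threshold parameter are assumptions made for illustration and are not taken
from the 2.2.16 source.

```c
#include <stdbool.h>

#define MODEL_MAX_ORDER 5   /* hypothetical; chosen only for the model */

/* nr_free[o] counts free blocks of 2^o contiguous pages. */
struct free_area_model {
    unsigned long nr_free[MODEL_MAX_ORDER];
};

/* Total free pages across all orders. */
static unsigned long model_total_free(const struct free_area_model *fa)
{
    unsigned long total = 0;

    for (int order = 0; order < MODEL_MAX_ORDER; order++)
        total += fa->nr_free[order] << order;
    return total;
}

/* An order-N request needs a free block of order N or larger. */
static bool model_can_alloc(const struct free_area_model *fa, int order)
{
    for (int o = order; o < MODEL_MAX_ORDER; o++)
        if (fa->nr_free[o] > 0)
            return true;
    return false;
}

/*
 * Models the decision to call try_to_free_pages(): reclaim is attempted
 * only when the free-page count is at or below the threshold.
 */
static bool model_would_reclaim(unsigned long nr_free, unsigned long threshold)
{
    return nr_free <= threshold;
}
```

For example, with 300 free order-0 pages and an assumed threshold of 256, an
order-1 request fails even though model_would_reclaim() says no reclaim is
needed, so nothing ever tries to free or coalesce memory.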

  OK, that is the description. Now, what is the fix?

  The slab allocator shouldn't be using high page orders. As the author of
that piece of code, the fault is mine. I've a new allocator in development
(http://www.nextd.demon.co.uk) which tries hard not to use non-zero page
orders. That isn't ready yet.

  Should __get_free_pages() try to make some progress in releasing pages if
an allocation fails?
  Maybe. The allocation which is causing the problem here is at GFP_KERNEL,
so it should be possible to get a few pages...

  Should wait_for_tcp_memory() check "current->need_resched"?
  I'd say yes.

So, for now, would a "need_resched" fix be acceptable?
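
One way to picture the "need_resched" idea is the user-space model below. It
is a sketch only: where exactly the check would go in wait_for_tcp_memory() is
an assumption, and the structs and model_* names are hypothetical rather than
the real kernel types.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the current task. */
struct task_model {
    bool need_resched;      /* set when another task should get the CPU */
    int  times_scheduled;   /* how often this task gave up the CPU */
};

/* Hypothetical stand-in for the socket fields named in the post. */
struct wait_sock_model {
    int wmem_alloc;
    int sndbuf;
};

/* Models schedule(): the task yields and the flag is cleared. */
static void model_schedule(struct task_model *t)
{
    t->need_resched = false;
    t->times_scheduled++;
}

/*
 * Models wait_for_tcp_memory() with the proposed check added. On the
 * early-return path the task now yields if a reschedule is pending, so a
 * waiting task such as kswapd gets a chance to run.
 */
static void model_wait_for_tcp_memory_fixed(struct task_model *cur,
                                            const struct wait_sock_model *sk)
{
    if (sk->wmem_alloc < sk->sndbuf) {
        if (cur->need_resched)
            model_schedule(cur);    /* assumed placement of the check */
        return;
    }
    model_schedule(cur);            /* buffer full: the real code sleeps */
}
```

In this model the spinning sender gives up the CPU once per pending
reschedule request and otherwise still returns immediately, so the normal
fast path is unchanged.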

Another quick point:
In kswapd(), it is possible for the kswapd daemon to drop down to the
schedule_timeout(). After it wakes up (eventually, 10secs is a long time!),
should it zero the "failed" counter?

Mark



