Re: The IBM order relaxation patch

From: David S. Miller (davem@redhat.com)
Date: Wed Feb 06 2002 - 23:01:00 EST


   From: Alan Cox <alan@lxorguk.ukuu.org.uk>
   Date: Thu, 7 Feb 2002 00:18:47 +0000 (GMT)

> >with a GFP flag? What they describe does not happen in an
> >interrupt context, so we can sleep.
>
> Because nobody even *tries* to free adjacent pages to build up
> a free order-2 area. You could wait really long ...
   
   Without the rmap patch you can't easily do it
   
One change between Rik's VM (both of them, the 2.4.9-based AC stuff and
RMAP) and the current stuff in Linus's tree is that allocations of
order 2 and smaller are all treated equally.

This got rid of a lot of problems on Sparc64 and with AF_UNIX sockets
for example. Sparc64 has the same issue that the IBM patch is trying
to solve: we need order 1 pages for our page table allocations. And
AF_UNIX was trying to use large linear buffers for better performance
during bulk transfers.
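
For anyone following along, "order n" here means a block of 2^n
physically contiguous pages. A minimal sketch of that arithmetic is
below. order_for_size() is a hypothetical userspace helper written for
illustration, not a kernel function, and the page size is a parameter
because it differs between architectures.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Smallest order such that (page_size << order) >= size.
 * An order-n allocation is 2^n contiguous pages.
 * Hypothetical helper for illustration only.
 */
unsigned int order_for_size(size_t size, size_t page_size)
{
    unsigned int order = 0;
    size_t span = page_size;

    assert(page_size > 0);
    while (span < size) {
        span <<= 1;     /* double the block: one order higher */
        order++;
    }
    return order;
}
```

So a buffer of four pages needs an order 2 block, and anything that
spills just past one page already needs order 1.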

Btw, the AF_UNIX side of this results in all kinds of MySQL
performance problems, or at least this is how I remember it.
(for more details on this grep for SKB_MAX_ALLOC in current
 2.4.x/2.5.x sources, in particular the references in
 include/linux/skbuff.h and net/unix/af_unix.c)

There was even a linux-kernel thread about all of this back in
the 2.4.{13,14,15} days. Perhaps someone can find it on
marc.theaimsgroup.com.

I do not think the Linus VM behavior is unreasonable. It basically
amounts to continually trying to free pages for all order 3 and below
allocations (if you can sleep and you aren't PF_MEMALLOC etc.).
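
As a rough sketch of that policy, written as a standalone predicate
rather than the actual allocator code: the order 3 cutoff comes from
the description above, while should_keep_trying() and its two boolean
parameters are hypothetical names standing in for "the caller may
sleep" and "the caller is PF_MEMALLOC".

```c
#include <stdbool.h>

/* Cutoff taken from the description above ("order 3 and below"). */
#define RETRY_MAX_ORDER 3

/*
 * Hypothetical predicate: after a failed allocation, should the
 * caller keep trying to free pages and retry?
 */
bool should_keep_trying(unsigned int order, bool can_sleep, bool is_memalloc)
{
    if (!can_sleep)
        return false;   /* cannot wait for pages to be freed */
    if (is_memalloc)
        return false;   /* PF_MEMALLOC-style callers are excluded */
    return order <= RETRY_MAX_ORDER;
}
```

Anything above the cutoff simply fails instead of looping, and the
"etc." conditions are deliberately left out of the sketch.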

> rmap method could help here, because with reverse mappings we
> can at least try to free adjacent areas (because we then at least
> *know* who's using the pages).
   
   rmap definitely makes it a real no-brainer to do this at least for small
   clusters of pages. Doing large chunks gets progressively harder.

You just have to be careful that you don't let the algorithm
degenerate into a dumb scan, which is the kind of silly stuff
the VM used to do back in the pre-2.2.x days :-)
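
One way to picture the non-degenerate case, purely as an illustration
and not how any of the VMs discussed here are implemented: for a
wanted order, only the pages inside one naturally aligned 2^order
block matter, so only those need to be looked at. The in_use[] array
and both function names below are hypothetical stand-ins for per-page
state.

```c
#include <assert.h>
#include <stddef.h>

/* First page frame of the aligned 2^order block containing pfn. */
unsigned long block_start(unsigned long pfn, unsigned int order)
{
    return pfn & ~((1UL << order) - 1);
}

/*
 * Count the pages in pfn's block that are still in use.
 * Only 2^order entries are examined, never the whole array.
 */
unsigned long busy_in_block(const unsigned char *in_use, unsigned long npages,
                            unsigned long pfn, unsigned int order)
{
    unsigned long start = block_start(pfn, order);
    unsigned long busy = 0;

    assert(start + (1UL << order) <= npages);
    for (unsigned long i = 0; i < (1UL << order); i++)
        busy += in_use[start + i] != 0;
    return busy;
}
```

With reverse mappings, the busy pages found this way are the ones you
would know how to go after. The cost stays bounded by the block size,
which is also why large orders get progressively harder.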