Re: Pagecache: find_or_create_page does not call a proper pageallocator function

From: Christoph Lameter
Date: Tue Apr 24 2007 - 16:30:50 EST


On Tue, 24 Apr 2007, Hugh Dickins wrote:

> On Tue, 24 Apr 2007, Christoph Lameter wrote:
> > On Tue, 24 Apr 2007, Hugh Dickins wrote:
> >
> > > Or Christoph may prevail in persuading there's no such problem.
> >
> > This is pointless. NUMA allocations can only be controlled for the highest
> > zone. If we switch to a lower zone then we allocate on a different zone
> > than the user requested.
>
> Sorry, I'm not following you there. Perhaps your comment is saying
> that second patch is pointless, because it forces allocations elsewhere
> than the policy requires? Rather than commenting on the line you quote?

NUMA allocation can only be controlled for the highest zone. If you switch
to another zone then no control of the node is possible anymore.

> (When I said I didn't know if the patch would have unwanted side-effects
> I was rather wondering what happens when you ask the allocator to do
> some impossible combination.)

It falls back from HIGHMEM to NORMAL (on 32 bit numa that means node 0.
The result of your patch is that someone requiring an alloc on node 4 will
get memory on node 0).

> > If the system has both high memory and normal memory then only allocations
> > to highmemory are subject to memory policies etc etc. The block device
> > allocations would be in zone normal/dma and thus be exempt from NUMA
> > placement.
>
> Please just point us to the line where sys_move_pages enforces this.

It does not. It will happily move the pages into highmem. The filesystem
cannot expect the page to remain in the same zone without holding a
reference count. Swap may also move pages between zones. There is nothing
special in what page migration does here.

I would say that the filesystem is broke if it has such expectations
regardless of page migration.



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/