Re: [RFC PATCH V5] mm readahead: Fix readahead fail for no local memory and limit readahead pages

From: Nishanth Aravamudan
Date: Mon Feb 17 2014 - 14:28:20 EST


On 14.02.2014 [02:54:06 -0800], David Rientjes wrote:
> On Thu, 13 Feb 2014, Nishanth Aravamudan wrote:
>
> > There is an open issue on powerpc with memoryless nodes (inasmuch as we
> > can have them, but the kernel doesn't support it properly). There is a
> > separate discussion going on on linuxppc-dev about what is necessary for
> > CONFIG_HAVE_MEMORYLESS_NODES to be supported.
> >
>
> Yeah, and this is causing problems with the slub allocator as well.
>
> > Apologies for hijacking the thread, my comments below were purely about
> > the memoryless node support, not about readahead specifically.
> >
>
> Neither you nor Raghavendra have any reason to apologize to anybody.
> Memoryless node support on powerpc isn't working very well right now and
> you're trying to fix it, that fix is needed both in this thread and in
> your fixes for slub. It's great to see both of you working hard on your
> platform to make it work the best.
>
> I think what you'll need to do in addition to your
> CONFIG_HAVE_MEMORYLESS_NODE fix, which is obviously needed, is to enable
> CONFIG_USE_PERCPU_NUMA_NODE_ID for the same NUMA configurations and then
> use set_numa_node() or set_cpu_numa_node() to properly store the mapping
> between cpu and node rather than numa_cpu_lookup_table. Then you should
> be able to do away with your own implementation of cpu_to_node().
>
> After that, I think it should be as simple as doing
>
> set_numa_node(cpu_to_node(cpu));
> set_numa_mem(local_memory_node(cpu_to_node(cpu)));
>
> probably before taking vector_lock in smp_callin(). The cpu-to-node
> mapping should be done much earlier in boot while the nodes are being
> initialized, I don't think there should be any problem there.

vector_lock/smp_callin are ia64 specific things, I believe? I think the
equivalent is just in start_secondary() for powerpc? (which in fact is
what calls smp_callin on powerpc).

Here is what I'm running into now:

    setup_arch ->
    do_init_bootmem ->
    cpu_numa_callback ->
    numa_setup_cpu ->
    map_cpu_to_node ->
    update_numa_cpu_lookup_table

Which currently updates the powerpc-specific numa_cpu_lookup_table. I
would like to update that function to use set_cpu_numa_node() and
set_cpu_numa_mem(), but local_memory_node() is not yet functional
because build_all_zonelists is called later in start_kernel. Would it
make sense for first_zones_zonelist() to return NUMA_NO_NODE if we
don't have a zone?
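
A minimal user-space sketch of that ordering problem, not kernel code.
It assumes the lookup reports NUMA_NO_NODE until the zonelists exist,
and that the cpu-to-memory-node mapping is refreshed afterwards. Every
model_* name, the arrays, and the node layout are hypothetical and only
stand in for the kernel state discussed above:

```c
#include <stdbool.h>

#define NUMA_NO_NODE (-1)	/* same value the kernel uses */
#define MAX_NODES 4
#define MAX_CPUS  8

/* Hypothetical stand-ins for kernel state (assumptions, not real APIs). */
static bool zonelists_built;	/* models build_all_zonelists() having run */
static bool node_has_memory[MAX_NODES] = { true, false, true, false };
static int cpu_node[MAX_CPUS];	/* models the per-cpu numa_node value */
static int cpu_mem[MAX_CPUS];	/* models the per-cpu numa_mem value */

/*
 * Models local_memory_node(): before the zonelists exist there is
 * nothing to walk, so report NUMA_NO_NODE instead of a bogus node.
 * Afterwards, scan from the cpu's own node (simple wrap-around order
 * here, not real NUMA distance) for the first node with memory.
 */
static int model_local_memory_node(int node)
{
	if (!zonelists_built)
		return NUMA_NO_NODE;

	for (int i = 0; i < MAX_NODES; i++) {
		int candidate = (node + i) % MAX_NODES;

		if (node_has_memory[candidate])
			return candidate;
	}
	return NUMA_NO_NODE;
}

/* Models the early map_cpu_to_node() step reached from setup_arch(). */
static void model_map_cpu_to_node(int cpu, int node)
{
	cpu_node[cpu] = node;
	cpu_mem[cpu] = model_local_memory_node(node);
}

/* Models the later point in start_kernel() where zonelists get built. */
static void model_build_all_zonelists(void)
{
	zonelists_built = true;
}

/* Re-resolve the memory node for a cpu that was mapped too early. */
static void model_refresh_cpu_mem(int cpu)
{
	cpu_mem[cpu] = model_local_memory_node(cpu_node[cpu]);
}
```

In this model a cpu mapped to memoryless node 1 during early boot gets
NUMA_NO_NODE as its memory node, and only resolves to node 2 once the
zonelists are built and the mapping is refreshed.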

Thanks,
Nish
