Re: [PATCH] per-zone kswapd process

From: Martin J. Bligh (mbligh@aracnet.com)
Date: Fri Sep 13 2002 - 00:10:37 EST


>> Also, I'm wondering why the individual kernel threads don't have
>> their affinity masks set to make them run on the CPUs to which the
>> zone (or zones) are local?
>> Isn't it the case that with this code you could end up with a kswapd
>> on node 0 crunching on node 1's pages while a kswapd on node 1 crunches
>> on node 0's pages?
>
> Without some architecture-neutral method of topology detection, there's
> no way to do this. A follow-up when it's there should fix it.

Every discontigmem arch should implement cpu_to_node, with a generic
fallback mechanism that returns 0 or something. Not that we do right
now, but that's easy to fix. There should also be a node_to_cpus call
that returns a bitmask of which cpus are in that node.

Matt ... want to sneak in the first bit of the topology patch, or
whatever lump this fell under? Seems like an appropriate juncture.
We have the code already somewhere, just need to fish it out.

>> If I'm not totally out to lunch on this, I'd have thought that a
>> better approach would be
>> int sys_kswapd(int nid)
>> {
>> return kernel_thread(kswapd, ...);
>> }
>> Userspace could then set up the CPU affinity based on some topology
>> or config information and would then parent a kswapd instance. That
>> kswapd instance would then be bound to the CPUs which were on the
>> node identified by `nid'.
>> Or something like that?
>
> I'm very very scared of handing things like that to userspace, largely
> because I don't trust userspace at all.
>
> At this point, we need to enumerate nodes and provide a cpu to node
> correspondence to userspace, and the kernel can obey, aside from the
> question of "What do we do if we need to scan a node without a kswapd
> started yet?". I think mbligh recently got the long-needed arch code in
> for cpu to node... But I'm just not able to make the leap of faith that
> memory detection is something that can ever comfortably be given to
> userspace.

I don't think the userspace stuff is necessary - we can do this all
in the kernel dead easily I think. Just need a couple of definitions,
which are trivially small functions.

M.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Sep 15 2002 - 22:00:32 EST