Re: New kernel/resource.c

allbery@kf8nh.apk.net
Sat, 17 Jul 1999 13:34:01 -0400 (EDT)


On 17 Jul, Linus Torvalds wrote:
+-----
| I'd be more convinced if you could show (for example) that you can
| actually simplify "vmalloc()" by using this to find empty holes. Or
| something like that.
+--->8

I'm not sure about the general case, but contemplate this:

We have a Sun Enterprise 63xx (not sure what exact model) with 8 CPUs
arranged in 4 banks of 2 CPUs, each bank with some amount of local
memory. Access from a CPU to its local memory is, if I understand
correctly, cheaper than access to memory on another CPU bank. (I don't
know whether main memory is composed entirely of these banks, or whether
each CPU bank sees a memory map composed of its local memory, a page
frame for another CPU bank's memory, and a pool of shared memory.) I'm
no memory architecture guru, but it sounds like a hybrid NUMA setup,
with standard SMP between CPUs on a single node and NUMA between nodes.
Is this anywhere near correct, or should I stick to sysadminning? :-)
There are also things like IBM's SP machines: I've "heard" (no idea
whether it's rumor or fact) that IBM has Linux running on individual SP
nodes, but no way to make them work together the way they're supposed
to. This sounds like a vaguely similar situation, with some resources
being more "local" than others.

I don't think the current vmalloc() et al. can cope with this "nicely",
i.e. allocate memory from different pools with differing amounts of
"proximity" to a CPU within a common address space. For example,
currently we deal with ISA DMA-able memory using a GFP_DMA flag
somewhere along the line, effectively making it a special case which
can't easily be generalized to multiple kinds of regions.

What I'm envisioning is a resource tree which can describe a simple
flat address space, a flat address space with some "special" memory
(e.g. ISA DMA-able space on PCs), a NUMA setup, etc. vmalloc() would
become a compatibility wrapper for a more general alloc_space() call
which would be passed a tree node. If you don't really care which pool
the memory comes from, you pass a "higher" resource tree node; if you
need the memory in the E63xx's local CPU RAM, an ISA DMA-able region,
etc., you pass the node for that specific resource. There might also be
an option for "prefer closer resources" vs. "use whatever is most
readily available" (i.e. "prefer speed" vs. "prefer size", or maybe
"prefer resource speed" vs. "prefer allocation speed"?). The same goes
for resources other than memory.
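
To make the tree idea a bit more concrete, here is a rough userspace
sketch in C. Everything in it is hypothetical: struct res_node, its
fields, and the alloc_space() signature are invented for illustration
and are not existing kernel interfaces. It also assumes no pool starts
at address 0 (0 means "failed"), uses a trivial bump allocator in each
leaf, and tries children in list order rather than by "proximity".

```c
#include <stddef.h>

/* Hypothetical sketch: none of these names exist in the kernel. */
struct res_node {
    const char *name;
    unsigned long start;       /* first address covered by this node */
    unsigned long end;         /* one past the last address */
    unsigned long next_free;   /* bump pointer; meaningful for leaves only */
    struct res_node *child;    /* first child, NULL for a leaf pool */
    struct res_node *sibling;  /* next node under the same parent */
};

/*
 * Allocate `size` bytes from `node` or from any pool beneath it.
 * Returns the start address of the allocation, or 0 if nothing in the
 * subtree has room (assumes no pool starts at address 0).
 */
unsigned long alloc_space(struct res_node *node, unsigned long size)
{
    struct res_node *c;
    unsigned long addr;

    if (node == NULL || size == 0)
        return 0;

    if (node->child == NULL) {
        /* Leaf: an actual pool, handled by a simple bump allocator. */
        if (node->end - node->next_free < size)
            return 0;
        addr = node->next_free;
        node->next_free += size;
        return addr;
    }

    /* Interior node: the caller doesn't care which pool, so try each
     * child subtree in order and take the first one that has room. */
    for (c = node->child; c != NULL; c = c->sibling) {
        addr = alloc_space(c, size);
        if (addr != 0)
            return addr;
    }
    return 0;
}
```

A caller that needs ISA DMA-able memory would pass the node for that
pool and get either memory from it or a failure; a caller that doesn't
care would pass the root and get the first pool with room. A "prefer
closer resources" option could presumably be a change in the order in
which the children are visited.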

In theory a node in the resource tree might include pointers to
allocation management functions, which could be inherited from higher
nodes when there's no need for special handling. Alternatively, a node's
allocation functions could call the parent's allocation functions (like
calling a superclass's method from a subclass method) when it needs to
modify the allocation scheme instead of overriding it.
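
A sketch of how the inheritance and the "call the superclass" case
might look in C follows. Again, every name here (struct rnode, struct
rnode_ops, rnode_alloc(), and so on) is made up for illustration, and
the page-rounding hook is just an arbitrary example of a node modifying
its parent's scheme. A NULL ops pointer is assumed to mean "inherit
from the nearest ancestor that has one".

```c
#include <stddef.h>

struct rnode;

/* Hypothetical per-node allocation hooks. `owner` is the node the hooks
 * are attached to; `target` is the node being allocated from. */
struct rnode_ops {
    unsigned long (*alloc)(struct rnode *owner, struct rnode *target,
                           unsigned long size);
};

struct rnode {
    struct rnode *parent;
    const struct rnode_ops *ops;  /* NULL: inherit from an ancestor */
    unsigned long next_free;      /* bump pointer for this node's range */
    unsigned long end;            /* one past the last address */
};

/* Find the nearest node at or above `n` that has its own hooks. */
static struct rnode *rnode_ops_owner(struct rnode *n)
{
    while (n != NULL && n->ops == NULL)
        n = n->parent;
    return n;
}

/* Allocate from `target` using the hooks found at or above `from`. */
static unsigned long rnode_dispatch(struct rnode *from, struct rnode *target,
                                    unsigned long size)
{
    struct rnode *owner = rnode_ops_owner(from);

    return owner ? owner->ops->alloc(owner, target, size) : 0;
}

/* Public entry point: returns an address, or 0 on failure. */
unsigned long rnode_alloc(struct rnode *n, unsigned long size)
{
    return rnode_dispatch(n, n, size);
}

/* Default scheme: a plain bump allocator on the target's range. */
unsigned long bump_alloc(struct rnode *owner, struct rnode *target,
                         unsigned long size)
{
    unsigned long addr;

    (void)owner;
    if (size == 0 || target->end - target->next_free < size)
        return 0;
    addr = target->next_free;
    target->next_free += size;
    return addr;
}

/* "Subclass" scheme: round the request up to 4K, then hand it to
 * whatever scheme is in effect above the node that owns this hook. */
unsigned long page_rounded_alloc(struct rnode *owner, struct rnode *target,
                                 unsigned long size)
{
    size = (size + 0xfffUL) & ~0xfffUL;
    return rnode_dispatch(owner->parent, target, size);
}

const struct rnode_ops bump_ops = { bump_alloc };
const struct rnode_ops page_ops = { page_rounded_alloc };
```

With bump_ops on the root, a child with ops == NULL simply gets the
bump allocator, while a child with page_ops (and anything below it)
gets page-rounded requests that are still satisfied by the root's
scheme. Passing the owner to the hook is one way to keep the
"superclass" lookup from looping back to the same hook when it has
itself been inherited.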

Does this make any sense, or am I babbling (again)?

-- 
brandon s. allbery	   os/2,linux,solaris,perl	allbery@kf8nh.apk.net
system administrator	    kth-krb,heimdal,gnome	  allbery@ece.cmu.edu
carnegie mellon / electrical and computer engineering			kf8nh
    We are Linux. Resistance is an indication that you missed the point.
