Re: SMP/cc Cluster description

From: Larry McVoy (lm@bitmover.com)
Date: Tue Dec 04 2001 - 22:23:17 EST

Next message: Nathan Bryant: "Re: i810 audio patch"
Previous message: Mike Fedyk: "Re: [kbuild-devel] Converting the 2.5 kernel to kbuild 2.5"
In reply to: David S. Miller: "Re: SMP/cc Cluster description"
Next in thread: David S. Miller: "Re: SMP/cc Cluster description"
Reply: David S. Miller: "Re: SMP/cc Cluster description"
Reply: Momchil Velikov: "Re: SMP/cc Cluster description"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tue, Dec 04, 2001 at 06:36:01PM -0800, David S. Miller wrote:
> From: Larry McVoy <lm@bitmover.com>
> Date: Tue, 4 Dec 2001 16:36:46 -0800
>
> OK, so start throwing stones at this. Once we have a memory model that
> works, I'll go through the process model.
>
> What is the difference between your messages and spin locks?
> Both seem to shuffle between cpus anytime anything interesting
> happens.
>
> In the spinlock case, I can thread out the locks in the page cache
> hash table so that the shuffling is reduced. In the message case, I
> always have to talk to someone.

Two points:

a) if you haven't already, go read the Solaris doors paper. Now think about
   a cross OS instead of a cross address space doors call. They are very
   similar. Think about a TLB shootdown, it's sort of a doors style cross
   call already, not exactly the same, but do you see what I mean?

b) I am not claiming that you will have less threading in the page cache.
   I suspect it will be virtually identical except that in my case 90%+
   of the threading is in the ccCluster file system, not the generic code.
   I do not want to get into a "my threading is better than your threading"
   discussion. I'll even go so far as to say that this approach will have
   more threading if that makes you happy, as long as we agree it is outside
   of the generic part of the kernel. So unless I have CONFIG_CC_CLUSTER,
   you pay nothing or very, very little.

   Where this approach wins big is everywhere except the page cache. Every
   single data structure in the system becomes N-way more parallel -- with
   no additional locks -- when you boot up N instances of the OS. That's
   a powerful statement. Sure, I freely admit that you'll add a few locks
   to deal with cross OS synchronization but all of those are configed out
   unless you are running a ccCluster. There is absolutely zero chance that
   you can get the same level of scaling with the same number of locks using
   the traditional approach. I'll go out on a limb and predict it is at
   least 100x as many locks. Think about it, there are a lot of things that
   are threaded simply because of some corner case that happens to show up
   in some benchmark or workload. In the ccCluster case, those by and large
   go away. Less locks, less deadlocks, less memory barriers, more reliable,
   less filling, tastes great and Mikey likes it :-)

-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Next message: Nathan Bryant: "Re: i810 audio patch"
Previous message: Mike Fedyk: "Re: [kbuild-devel] Converting the 2.5 kernel to kbuild 2.5"
In reply to: David S. Miller: "Re: SMP/cc Cluster description"
Next in thread: David S. Miller: "Re: SMP/cc Cluster description"
Reply: David S. Miller: "Re: SMP/cc Cluster description"
Reply: Momchil Velikov: "Re: SMP/cc Cluster description"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

This archive was generated by hypermail 2b29 : Fri Dec 07 2001 - 21:00:27 EST