Re: SMP Theory (was: Re: Interesting analysis of linux kernel threading by IBM)

From: Jeff V. Merkey (jmerkey@timpanogas.com)
Date: Tue Jan 25 2000 - 14:52:47 EST


How do you do fault tolerance on Shared-Everything (like what you
describe)? Sounds like COMA or SCI? Most folks dismiss
shared-everything architectures as non-resilient (Intel and Microsoft
have traditionally been shared-nothing bigots on this point).

Jeff

Iain McClatchie wrote:
>
> One of the problems with this forum is that you can't hear the murmur
> of assent ripple through the hardware design crowd when Larry rants
> about this stuff. Larry has had his head out of the box for a long
> time.
>
> Look at the ASCI project. The intention was for SGI to build an
> Origin with around 1000 CPUs. That Origin had extra cache coherence
> directory RAM and special encodings in that RAM so that the hardware
> could actually keep the memory across all 1000 CPUs coherent. We
> added extra physical address bits to the R10K to make this machine
> possible.
>
> Last I heard, the machine is mostly programmed with message passing.
>
> I remember having a talk with an O/S guy who was implementing some
> sort of message delivery utility inside the O/S. This was when
> Cellular IRIX was in development, and they were investigating having
> the various O/S images talk to each other with messages across the
> shared memory. Then someone found out the O/S images could signal
> each other FASTER through the HIPPI connections than they could
> through shared memory. That is, this machine had a HIPPI port local
> to each O/S image, and all those HIPPI ports were connected together
> via a HIPPI switch.
>
> Those HIPPI connections were built with the _same_physical_link_ as
> the shared memory - an 800 MB/s source-synchronous channel. But if
> you're sending a message, it's better to have the I/O system just
> send the bits one way than have the shared memory system do two round
> trips, one to invalidate the mailbox buffer for writing and another to
> process the remote cache miss to receive the message.
>
> -Iain McClatchie
> www.10xinc.com
> iain@10xinc.com
> 650-364-0520 voice
> 650-364-0530 FAX
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.rutgers.edu
> Please read the FAQ at http://www.tux.org/lkml/
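
[Editor's sketch, not part of either message.] The round-trip argument in
the quoted text can be put as a toy cost model. It only counts one-way
link traversals for the two delivery paths described there. The one-way
latency figure and all function names are assumptions made up for
illustration, not measured Origin or HIPPI numbers; only the 800 MB/s
bandwidth comes from the quoted message.

```python
# Toy model: count one-way link traversals needed to deliver one message.
# LINK_LATENCY_NS is a hypothetical figure chosen for illustration only.
LINK_LATENCY_NS = 500        # assumed one-way latency of the channel, in ns
LINK_BANDWIDTH_MB_S = 800    # channel bandwidth quoted in the message above


def transfer_ns(payload_bytes):
    """Time to stream the payload over the link (1 MB taken as 10**6 bytes)."""
    return payload_bytes * 1000 / LINK_BANDWIDTH_MB_S


def shared_memory_mailbox_ns(payload_bytes):
    """Mailbox in coherent shared memory: two round trips, as described above."""
    # Round trip 1: sender invalidates the mailbox buffer to gain write access.
    # Round trip 2: receiver takes a remote cache miss and pulls the data.
    traversals = 4
    return traversals * LINK_LATENCY_NS + transfer_ns(payload_bytes)


def io_push_ns(payload_bytes):
    """I/O path: the bits are simply sent one way."""
    traversals = 1
    return traversals * LINK_LATENCY_NS + transfer_ns(payload_bytes)


if __name__ == "__main__":
    for size in (64, 128, 4096):
        print(size, shared_memory_mailbox_ns(size), io_push_ns(size))
```

Under these assumptions the gap between the two paths is a fixed three
link latencies per message, so it matters most for small messages, where
latency rather than bandwidth dominates.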




This archive was generated by hypermail 2b29 : Mon Jan 31 2000 - 21:00:15 EST