So.. Alan.. :) Very few people know squat about SMP, but it sure would be
cool if Linux could restart a dead CPU when in SMP mode.. I don't know how
much impact it would have on real-world fault tolerance, but I don't think
the overhead of supporting it would be large.. And in this situation it
would be VERY appropriate..
On Sun, 9 Nov 1997, Alan Cox wrote:
> > First you'd have to toy with the APIC so that somehow the working CPU
> > would eventually get an interrupt (RTC trick maybe?). (MAYBE NOT POSSIBLE)
>
> On non 486 boxes in 2.1.x the local APIC is already the interrupt source
> for timer interrupts
But which CPU do they send them to? (I'm assuming they send them to the
next CPU that needs to be scheduled, and if that interrupt gets lost then
the computer freezes.) There needs to be an alternate interrupt that will
go to the other CPU..
> > Then it would have to figure out that the other CPU isn't working.
> > (LIKELY EASY)
>
> Very
>
> > Then it would have to reboot and reinitialize the other CPU. (VERY HARD)
>
> Actually that bit is fairly easy if the CPU still responds to reset - you
> can ask the APIC to reset the other processor. The problem is you would
> need to know exactly what the other processor was doing at the time and
> reverse it. Some fault tolerant systems do just that.
>
> Alan
>
Well, I'm not familiar with how the SMP stuff inits CPUs.. I thought it
would be hard because the CPU might not even reset.. Further, you don't
know if it has done other bad things, like scribbling on memory..
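
For the "figure out that the other CPU isn't working" step, here is a rough
sketch of what I have in mind. It is plain C that only models the idea, not
kernel code: every name in it (heartbeat, cpu_looks_dead, NCPUS, STALL_LIMIT)
is made up for illustration. It assumes each CPU bumps its own counter from
its timer interrupt, and that a healthy CPU compares counters now and then.

```c
#include <stdbool.h>

#define NCPUS       2   /* hypothetical: a two-CPU box */
#define STALL_LIMIT 3   /* hypothetical: stalled checks before we give up */

struct cpu_watch {
    unsigned long beat;       /* bumped by the owning CPU on each timer tick */
    unsigned long last_seen;  /* counter value the checker saw last time */
    unsigned int  missed;     /* consecutive checks with no progress */
};

/* A real kernel would need atomic or volatile access here; this model
 * is single-threaded, so plain fields are enough. */
static struct cpu_watch watch[NCPUS];

/* Called by CPU 'cpu' from its own timer interrupt. */
void heartbeat(int cpu)
{
    watch[cpu].beat++;
}

/* Called periodically by a healthy CPU. Returns true once 'cpu' has made
 * no progress for STALL_LIMIT checks in a row. */
bool cpu_looks_dead(int cpu)
{
    struct cpu_watch *w = &watch[cpu];

    if (w->beat != w->last_seen) {
        /* The counter moved, so the CPU is still taking timer ticks. */
        w->last_seen = w->beat;
        w->missed = 0;
        return false;
    }
    return ++w->missed >= STALL_LIMIT;
}
```

This only covers detection. It says nothing about whether the CPU will
respond to a reset, or what state it left behind.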