Re: [PATCH] stopmachine: add stopmachine_timeout v3

From: Hidetoshi Seto
Date: Wed Jul 16 2008 - 04:16:08 EST


Peter Zijlstra wrote:
> I really don't like this, it means the system is really screwed up and
> doesn't deserve to continue.

One could say that after the timeout we simply go back to the previous
state, in which the machine is already limping (i.e. partially screwed
up) but still delivers some degree of performance. We might be able to
attempt some recovery, such as killing processes or restarting or
resetting a subsystem. Even if a CPU gets stuck, it might be possible
to continue service on the remaining CPUs, e.g. on a system with 1024
CPUs in total.
(I wish we could force-reset such an unstable CPU in the future...)

I agree that there are many situations where this feature is not
acceptable. But there are others where it would be.

Thanks,
H.Seto