Re: [Fastboot] [PATCH] kdump: add a missing notifier before crashing

From: Akiyama, Nobuyuki
Date: Fri Jun 16 2006 - 08:14:00 EST


On Fri, 16 Jun 2006 00:28:08 -0600
ebiederm@xxxxxxxxxxxx (Eric W. Biederman) wrote:

> Please give a concrete example of a failure mode where this allows
> you to meet your timing constraint.
>
> I have yet to be convinced that this actually solves a real world
> problem.
>
> What is the cost of the notifier you wish to implement?
> What is your guarantee that the system won't have wasted seconds
> detecting it can't allocate memory or other cases?
>
> If we go this route the notifier should not be exported to modules.
> Only the most scrutinized of code paths should ever set this,
> and code like that should never be a module.
>
> The patchset that adds the notifier call needs to include the notifier
> so people can look and see how sane this is.
>
> So far what I have seen are hand waving arguments that failures
> that can never happen must be detected and reported within
> milliseconds to another machine in an unspecified manner. Your kernel
> startup times are asserted to be too large to do this from the next
> kernel, but the code to do so is sufficiently complicated you can't do
> this in the kexec code stub that runs before it starts your next
> kernel.
>
> I am sympathetic but this interface seems to set expectations that
> we can do the impossible, and it still appears unnecessary to me.

As I have mentioned many times, this notifier is very effective in
clustering systems. In fact, Red Hat's kernel has a notifier at the
same point as this patch. I know of a real system in which failover
of a DB server is triggered immediately by this notifier.
Keeping transaction processing consistent on a DB system is costly.
Therefore, to shorten downtime, it is very important to know
immediately that the system has died. A mission-critical system
demands detection on the order of milliseconds or less.
The notifier powers off the SCSI adaptor to stop all writes to the
shared disk, and then notifies the standby node.
But as you suspect, things will not necessarily follow this scenario.
For instance, if the kernel hangs, the failure can be detected only
by heartbeat. In that case the detection time becomes longer.
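
To make the sequence above concrete, here is a rough userspace sketch
in C of a priority-ordered notifier chain that fences the shared disk
first and then tells the standby node. It is only a model under my own
assumptions: every name in it (crash_notifier, crash_notifier_register,
fence_shared_disk, notify_standby and so on) is hypothetical and is not
taken from the patch or from any existing kernel interface, and the two
callbacks only record that they ran instead of touching hardware or the
network.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of a pre-crash notifier chain.
 * None of these names come from the patch under discussion. */

enum step { STEP_FENCE_DISK, STEP_NOTIFY_STANDBY };

#define MAX_STEPS 8

/* Records the order in which callbacks ran. */
struct event_log {
    enum step steps[MAX_STEPS];
    int count;
};

struct crash_notifier {
    int priority;                 /* higher priority runs first */
    void (*call)(void *ctx);      /* must be short and must not block */
    void *ctx;
    struct crash_notifier *next;
};

static struct crash_notifier *crash_chain;

/* Insert nb so the list stays sorted by descending priority. */
static void crash_notifier_register(struct crash_notifier *nb)
{
    struct crash_notifier **p = &crash_chain;

    assert(nb != NULL && nb->call != NULL);
    while (*p != NULL && (*p)->priority >= nb->priority)
        p = &(*p)->next;
    nb->next = *p;
    *p = nb;
}

/* Run every callback once, in priority order; return how many ran. */
static int crash_notifier_call_chain(void)
{
    struct crash_notifier *nb;
    int n = 0;

    for (nb = crash_chain; nb != NULL; nb = nb->next) {
        nb->call(nb->ctx);
        n++;
    }
    return n;
}

static void log_step(struct event_log *log, enum step s)
{
    if (log->count < MAX_STEPS)
        log->steps[log->count++] = s;
}

/* Stand-in for powering off the SCSI adaptor (assumption: a real
 * implementation would poke the hardware here). */
static void fence_shared_disk(void *ctx)
{
    log_step(ctx, STEP_FENCE_DISK);
}

/* Stand-in for telling the standby node to take over. */
static void notify_standby(void *ctx)
{
    log_step(ctx, STEP_NOTIFY_STANDBY);
}
```

In this model the fencing callback would be registered with a higher
priority than the standby notification, so that writes to the shared
disk have already stopped when the standby node takes over, whatever
order the two were registered in.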

Anyway, this notifier is very effective and important in the real world.
Another example is as follows:

http://lists.osdl.org/pipermail/fastboot/2006-June/003028.html

Thanks,

Akiyama, Nobuyuki
