Re: [Fastboot] [PATCH] kdump: add a missing notifier before crashing

From: Akiyama, Nobuyuki
Date: Mon Jun 19 2006 - 03:28:12 EST


On Fri, 16 Jun 2006 10:37:05 -0600
ebiederm@xxxxxxxxxxxx (Eric W. Biederman) wrote:

> > The processing of the notifier is to make a SCSI adaptor power off to
> > stop writing in the shared disk completely and then notify to standby-node.
>
> The kernel has called panic no new SCSI operations were execute.
> I'm not saying don't notify your standby-node

As you say, the kernel issues no new SCSI operations after panic.
But many SCSI adaptors, RAID cards in particular, hold write data in
their cache and flush it to disk a few seconds after the write command
was issued. To stop writes to the shared disk immediately and
completely, we have to power off the adaptor.

> Please walk me through a real world kernel failure, and show me how
> your millisecond requirement is met.
>
> In the example please answer:
> - What causes the kernel to call panic?
> - From the real failure to the kernel calling panic how long
> does it take?

For instance, when a file system inconsistency is detected, very
little time passes before panic is invoked.
I have seen various kernel failures so far, and unfortunately such
failures will continue to occur.

> - What actions does the notifier take to tell the other kernel
> it is dead.

The notifier only writes to the BMC a few times through the IPMI
interface. The operation is very simple: just a few outb calls.
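
To illustrate the shape of such a notifier, here is a minimal
user-space sketch. It is not the patch and not kernel code. All names
(crash_notifier, mock_outb, bmc_notify) are hypothetical, the byte
values are placeholders rather than real IPMI commands, and mock_outb
only records the bytes instead of touching an I/O port.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of a crash notifier chain. */
struct crash_notifier {
    int (*call)(struct crash_notifier *self, void *data);
    int priority;                    /* higher priority runs first */
    struct crash_notifier *next;
};

static struct crash_notifier *crash_chain;

/* Insert a notifier into the chain, keeping it sorted by priority. */
void crash_notifier_register(struct crash_notifier *n)
{
    struct crash_notifier **pp = &crash_chain;

    assert(n != NULL && n->call != NULL);
    while (*pp && (*pp)->priority >= n->priority)
        pp = &(*pp)->next;
    n->next = *pp;
    *pp = n;
}

/* Run every registered notifier once; return how many were called. */
int crash_notifier_call_chain(void *data)
{
    int called = 0;
    struct crash_notifier *n;

    for (n = crash_chain; n; n = n->next) {
        n->call(n, data);
        called++;
    }
    return called;
}

/* Stand-in for outb(): records the byte instead of writing a port. */
#define MOCK_LOG_MAX 8
static unsigned char mock_port_log[MOCK_LOG_MAX];
static size_t mock_port_writes;

void mock_outb(unsigned char value)
{
    if (mock_port_writes < MOCK_LOG_MAX)
        mock_port_log[mock_port_writes++] = value;
}

/* Hypothetical notifier: a few fixed writes, no locks, no allocation. */
int bmc_notify(struct crash_notifier *self, void *data)
{
    (void)self;
    (void)data;
    mock_outb(0x01);                 /* placeholder byte */
    mock_outb(0x02);                 /* placeholder byte */
    return 0;
}

struct crash_notifier bmc_nb = { bmc_notify, 0, NULL };
```

The crash path would register bmc_nb once at setup time and call
crash_notifier_call_chain(NULL) just before starting the kdump kernel.
The point of the sketch is that the callback does nothing except a few
fixed writes, so little can go wrong inside it.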

> - Why do we think the kernel taking that action will be reliable?

I agree that the notifier may reduce reliability compared with doing
nothing. That depends on the quality of the notifier's processing.
But I think the notifier is needed, because notifying the standby
node before the crash dump starts is more effective.

> - From the point where we call panic() how long does it take until
> the kdump kernel is active?

On my box it takes about one second, but on an actual enterprise
system which has many disks (hundreds or more) it takes longer.

Thanks,

--
Akiyama, Nobuyuki
