Re: [KERNEL BUG] do_timer/tick_handover_do_timer 3.10.17

From: Mike Galbraith
Date: Fri May 08 2015 - 01:12:18 EST


On Fri, 2015-05-08 at 04:16 +0000, Oza (Pawandeep) Oza wrote:
> So Mike, is this reason strong enough for you ?

Nope. I think you did the right thing in removing your dependency on
jiffies reliability in a dying box. You don't have to convince me of
anything though, CC timer subsystem maintainer, see what he says.

> I understand your point: solve the BUG, and I do tend to agree with you.
>
> But by design and implementation, the BUG() is just a beginning of the end for dying kernel.
> And what happens in between this 'the beginning' and 'the end' is not less important.
> (because say, on our platform we want to get clean RAMDUMP to analyze what happened, and for that we want to get clean reboot)

I don't see anybody else having any trouble getting crash dumps. I
spent yet another long day just yesterday, rummaging through one.

> Also,
> If somebody's design is to legally Crash the kernel (e.g. where kernel is actually not faulty).
> Then, I do expect that tick/timekeeping framework do its job as long as it can do, and it should do, because kernel is not faulty.
> But in this case it doesn't handover jiffies incrementing job sanely.

It seems odd to me to use BUG() for what you appear to be using it for..
not that I know exactly what that is, mind you, but when you said when
some other gizmo in your box has a problem you crash the kernel, my head
tilted to the side - surely there's a more controlled response possible
than poking the big red self destruct button ;-)

-Mike
