Re: drm/mgag200: doesn't work in panic context

From: Daniel Vetter
Date: Tue Jun 30 2015 - 02:37:09 EST


On Tue, Jun 30, 2015 at 4:53 AM, Rui Wang <rui.y.wang@xxxxxxxxx> wrote:
> On Monday, June 29, 2015 5:25 PM, Daniel Vetter <daniel.vetter@xxxxxxxx> wrote:
>> As long as the display is up and running we should have a fair stab at
>> showing the oops - it's just that no one has seriously bothered with
>> the necessary infrastructure, automated testing (it won't work
>> otherwise) and driver work.
>
> I think testing can be done by injecting a fatal machine check exception
> via einj's debugfs interface. I can reproduce the hard hang every time.
> I think it can be a simple script or C program to do the automated testing.
> If anyone has any patch I'll be happy to help test it out.

Testing shouldn't kill the machine ;-)

The idea I had is to just exercise the drm panic code (since we'd need
to shunt everything else), and that can be done by calling the
relevant functions from a hardirq context. And hardirq context is
simple to get with an IPI to the local cpu. This way we don't depend
upon the entire panic path being recoverable, but only upon the drm
bits being sane.
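
Very roughly, and untested, such a test hook could look like the module
below. It uses irq_work, which on architectures with self-IPI support
runs its callback from hardirq context on the local cpu (elsewhere it
runs from the timer tick). drm_panic_selftest_draw() is a made-up
placeholder for whatever the drm panic entry point ends up being, and
the sketch assumes a non-PREEMPT_RT kernel recent enough to have
in_hardirq() (older kernels spell it in_irq()).

```c
#include <linux/module.h>
#include <linux/irq_work.h>
#include <linux/preempt.h>
#include <linux/printk.h>

/* Hypothetical stand-in for the real drm panic drawing entry point. */
static void drm_panic_selftest_draw(void)
{
	pr_info("drm panic selftest: in_hardirq=%d\n", !!in_hardirq());
}

/* irq_work callback: this is where the code under test gets exercised. */
static void panic_selftest_fn(struct irq_work *work)
{
	drm_panic_selftest_draw();
}

static struct irq_work panic_selftest_work;

static int __init panic_selftest_init(void)
{
	init_irq_work(&panic_selftest_work, panic_selftest_fn);
	/* Queue on the local cpu, then wait until the callback has run. */
	irq_work_queue(&panic_selftest_work);
	irq_work_sync(&panic_selftest_work);
	return 0;
}

static void __exit panic_selftest_exit(void)
{
	irq_work_sync(&panic_selftest_work);
}

module_init(panic_selftest_init);
module_exit(panic_selftest_exit);
MODULE_LICENSE("GPL");
```

Loading the module fires the callback once, and the machine stays up
afterwards, so this can run in an automated loop.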

The other thing that needs testing is pushing all the fbdev callbacks
into workers if run from non-process context. But that's something
fbdev likes doing anyway (it's just that most drivers don't need to
override the hooks where this usually happens).
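
To illustrate the run-or-defer pattern, here is a small userspace model
in plain C, not kernel code. All names are made up for the example. In
the kernel the context check would be something like in_task() and the
queue would be a work item handed to schedule_work(); here a flag and a
fixed-size array stand in for both.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

typedef void (*fb_op_fn)(void *arg);

struct deferred_op {
	fb_op_fn fn;
	void *arg;
};

/* Model state: stands in for the kernel's context check and workqueue. */
static struct deferred_op pending[MAX_PENDING];
static size_t pending_count;
static bool in_process_context = true;

/*
 * Run the callback directly when in process context, otherwise queue it
 * for the worker. Returns false only if the queue is full.
 */
static bool fb_op_run_or_defer(fb_op_fn fn, void *arg)
{
	if (in_process_context) {
		fn(arg);
		return true;
	}
	if (pending_count == MAX_PENDING)
		return false;
	pending[pending_count].fn = fn;
	pending[pending_count].arg = arg;
	pending_count++;
	return true;
}

/* Worker side: run everything queued so far, return how many ops ran. */
static size_t fb_worker_drain(void)
{
	size_t i, ran = 0;

	for (i = 0; i < pending_count; i++) {
		pending[i].fn(pending[i].arg);
		ran++;
	}
	pending_count = 0;
	return ran;
}

/* Example callback: counts how often it was invoked. */
static void count_call(void *arg)
{
	(*(int *)arg)++;
}
```

A test for the deferral path then only has to check that nothing runs
while the flag says "not process context" and that the queued callback
runs exactly once when the worker drains.
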
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch