Re: drm/nouveau: crash regression in 3.5

From: Maarten Lankhorst
Date: Mon Jul 30 2012 - 04:46:33 EST


Hey,

Op 29-07-12 22:15, Marcin Slusarz schreef:
> On Thu, Jul 26, 2012 at 02:56:22PM +0200, Ortwin GlÃck wrote:
>> On 25.07.2012 20:42, Marcin Slusarz wrote:
>>> Good, below patch should fix this panic.
>>>
>>> Note that you can hit an oops in drm_handle_vblank because patch from
>>> http://lists.freedesktop.org/archives/dri-devel/2012-May/023498.html
>>> has not been applied (yet?).
>> After applying your patch, it still crashes, although with a slightly
>> different stack trace. I then also applied the second patch, but that
>> doesn't make any difference. New log attached.
>>
>> Looks like interrupt occurs before nouveau_software_context_new() is
>> called? Shouldn't the initialization be done from
>> nouveau_irq_preinstall() so it is available when the irq occurs? Again,
>> I am not an expert here. Just guessing...
> No, the real problem is: with "noaccel" we don't register "software engine",
> but vblank ISR relies on its existance and happily derefences NULL pointer.
>
> Now, this patch should fix it for real...
>
> ---
> From: Marcin Slusarz <marcin.slusarz@xxxxxxxxx>
> Subject: [PATCH] drm/nouveau: disable vblank interrupts before registering PDISPLAY ISR
>
> Currently, we register vblank IRQ handler and later we disable vblank
> interrupts. So, for the short amount of time, we rely on vblank ISR
> to operate correctly, even if vblank interrupts are never going to be
> used later.
>
> In "noaccel" case, software engine - which is used by vblank ISR - is not
> registered, so if vblank interrupt triggers in a wrong moment, we can hit
> NULL pointer dereference in nouveau_software_vblank.
>
> To fix it, disable vblank interrupts before registering PDISPLAY ISR.
>
> Reported-by: Ortwin GlÃck <odi@xxxxxx>
> Signed-off-by: Marcin Slusarz <marcin.slusarz@xxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx [3.5]
>
I can confirm this bug when I was working on adding d0 vblank, but it seems
to me like nv*_display_create would be a more logical place to put the disable
call. This will make it more clear that vblank is disabled before the irq is registered.

Also maybe add a WARN_ON(!nv_engine(dev, NVOBJ_ENGINE_SW)); in
nouveau_vblank_enable and/or return -EINVAL in that case?

If you add the modification to nouveau_vblank_enable:
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxx>

~Maarten
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/