Re: [Nouveau] [PATCH 09/17] drm/radeon: use common fence implementation for fences

From: Maarten Lankhorst
Date: Wed Jul 23 2014 - 03:52:25 EST


op 23-07-14 09:37, Christian KÃnig schreef:
> Am 23.07.2014 09:31, schrieb Daniel Vetter:
>> On Wed, Jul 23, 2014 at 9:26 AM, Christian KÃnig
>> <deathsimple@xxxxxxxxxxx> wrote:
>>> It's not a locking problem I'm talking about here. Radeons lockup handling
>>> kicks in when anything calls into the driver from the outside, if you have a
>>> fence wait function that's called from the outside but doesn't handle
>>> lockups you essentially rely on somebody else calling another radeon
>>> function for the lockup to be resolved.
>> So you don't have a timer in radeon that periodically checks whether
>> progress is still being made? That's the approach we're using in i915,
>> together with some tricks to kick any stuck waiters so that we can
>> reliably step in and grab locks for the reset.
>
> We tried this approach, but it didn't worked at all.
>
> I already considered trying it again because of the upcoming fence implementation, but reconsidering that when a driver is forced to change it's handling because of the fence implementation that's just another hint that there is something wrong here.
As far as I can tell it wouldn't need to be reworked for the fence implementation currently, only the moment you want to allow callers outside of radeon. :-)
Doing a GPU lockup recovery in the wait function would be messy even right now, you would hit a deadlock in ttm_bo_delayed_delete -> ttm_bo_cleanup_refs_and_unlock.

Regardless of the fence implementation, why would it be a good idea to do a full lockup recovery when some other driver is
calling your wait function? That doesn't seem to be a nice thing to do, so I think a timeout is the best error you could return here,
other drivers have to deal with that anyway.

~Maarten

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/