Re: [RFC PATCHv2 1/2] drivers: power: Add watchdog timer to catch drivers which lockup during suspend/resume.

From: Zoran Markovic
Date: Wed Jun 05 2013 - 18:18:07 EST


Rafael,

>>> We could do cancel_work_sync() as a recovery, but that call blocks until the
>>> running async task is flushed, which might never happen. So doing a panic()
>>> is pretty much the only option for recovering.
>>
>> Well, its usefulness is quite limited, then. That said I'm still not convinced
>> that this actually is the case.
>
> It does block in my environment, AFAICS. Looking a bit further in the
> code, it looks like dpm_suspend() does an async_synchronize_full()
> which would wait for all async tasks to complete. This is a
> show-stopper because (under the circumstances) the assumption that
> every async suspend routine eventually completes doesn't hold.
>
> We could possibly select which async tasks to wait for, but this would
> add unnecessary complexity to a feature targeted for debugging. It
> seems that this approach - although sounding reasonable - needs to
> wait until we have a mechanism to cancel an async task.

It looks like implementing the proposed async suspend +
wait_for_completion_timeout approach is quite complex because of the
limitations above: even if the wait times out, we still cannot cancel or
flush the stuck async task. How do we proceed from here? We have the
following options:
1. Give up on the idea of having a suspend/resume watchdog.
2. Use the timer implementation (with possible modifications).
3. Wait for (or implement) a mechanism to kill an already running
async work item.

Are there any other ideas floating around?

Thanks,
Zoran