Re: [PATCH 0/5] improve handling of errors returned by kthread_park()

From: Andrew Morton
Date: Tue Sep 29 2015 - 19:30:43 EST


On Mon, 28 Sep 2015 22:44:07 +0200 Ulrich Obergfell <uobergfe@xxxxxxxxxx> wrote:

> The original watchdog_park_threads() function that was introduced by
> commit 81a4beef91ba4a9e8ad6054ca9933dff7e25ff28 takes a very simple
> approach to handle errors returned by kthread_park(): It attempts to
> roll back all watchdog threads to the unparked state. However, this
> may be undesired behaviour from the perspective of the caller which
> may want to handle errors as appropriate in its specific context.
> Currently, there are two possible call chains:
>
> - watchdog suspend/resume interface
>
>     lockup_detector_suspend
>       watchdog_park_threads
>
> - write to parameters in /proc/sys/kernel
>
>     proc_watchdog_update
>       watchdog_enable_all_cpus
>         update_watchdog_all_cpus
>           watchdog_park_threads
>
> Instead of 'blindly' attempting to unpark the watchdog threads if a
> kthread_park() call fails, the new approach is to disable the lockup
> detectors in the above call chains. Failure becomes visible to the
> user as follows:
>
> - error messages from lockup_detector_suspend()
>   or watchdog_enable_all_cpus()
>
> - the state that can be read from /proc/sys/kernel/watchdog_enabled
>
> - the 'write' system call in the latter call chain returns an error
>
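
For readers following along, here is a rough userspace sketch of the two
error-handling strategies described above. It is not the kernel code: the
fake_* helpers, the parked[] array, the fail_on_cpu knob and the -1 error
value are all hypothetical stand-ins, and watchdog_enabled only models the
value read from /proc/sys/kernel/watchdog_enabled.

```c
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 4

/* Hypothetical stand-ins for per-CPU watchdog thread state. */
static bool parked[NR_CPUS];
static int watchdog_enabled = 1;  /* models /proc/sys/kernel/watchdog_enabled */
static int fail_on_cpu = -1;      /* test knob: CPU whose park attempt fails */

/* Stand-in for kthread_park(); the real error value is not modelled. */
static int fake_kthread_park(int cpu)
{
	if (cpu == fail_on_cpu)
		return -1;
	parked[cpu] = true;
	return 0;
}

/* Stand-in for kthread_unpark(). */
static void fake_kthread_unpark(int cpu)
{
	parked[cpu] = false;
}

/* Original approach: on the first failure, unpark every thread again. */
static int park_threads_rollback(void)
{
	int cpu, ret = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		ret = fake_kthread_park(cpu);
		if (ret)
			break;
	}
	if (ret) {
		for (cpu = 0; cpu < NR_CPUS; cpu++)
			fake_kthread_unpark(cpu);
	}
	return ret;
}

/* New approach: stop at the first failure and only report it. */
static int park_threads(void)
{
	int cpu, ret = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		ret = fake_kthread_park(cpu);
		if (ret)
			break;
	}
	return ret;
}

/* Caller in the style of the suspend path: disable the detector on error. */
static int suspend_detector(void)
{
	int ret = park_threads();

	if (ret) {
		watchdog_enabled = 0;
		fprintf(stderr, "failed to park threads, detector disabled\n");
	}
	return ret;
}
```

With fail_on_cpu set to 2, park_threads_rollback() leaves every thread
unparked and watchdog_enabled untouched, whereas suspend_detector() leaves
CPUs 0 and 1 parked and clears watchdog_enabled so the failure is visible
to whoever reads the state.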

hm, you made me look at kthread parking. Why does it exist? What is a
"parked" thread anyway, and how does it differ from, say, a sleeping
one? The 2a1d446019f9a5983ec5a335b changelog is pretty useless and the
patch added no useful documentation, sigh.

Anyway... what inspired this patchset? Are you experiencing
kthread_park() failures in practice? If so, what is causing them? And
what is the user-visible effect of these failures? This is all pretty
important context for such a patchset.

