Re: [patch 11/29] lockup_detector: Remove park_in_progress hackery

From: Don Zickus
Date: Tue Sep 05 2017 - 11:15:47 EST


On Mon, Sep 04, 2017 at 02:10:50PM +0200, Peter Zijlstra wrote:
> On Mon, Sep 04, 2017 at 01:09:06PM +0200, Ulrich Obergfell wrote:
>
> > - A thread hogs CPU N (soft lockup) so that watchdog/N is unable to run.
> > - A user re-configures 'watchdog_thresh' on the fly. The reconfiguration
> > requires parking/unparking of all watchdog threads.
>
> This is where you fail, its silly to require parking for
> reconfiguration.

Hi Peter,

Ok, please elaborate. Unless I am misunderstanding, that is what Thomas
requested us to do years ago when he implemented the parking/unparking
scheme, and what his current patch set is doing now.

The point of parking, I believe, was to avoid the overhead of tearing down
a thread and restarting it when the code needed to update various lockup
detector settings.

So if we can't depend on parking for reconfiguration, then what are the
other options (besides tearing down threads)?

I am not trying to be argumentative here, just trying to fill in the
disconnect between us.


Hi Uli,

I think the race you detailed is solved by Thomas's patches. In the
original design we set the sample period first and then tried to park the
threads, which created the mess. With this patchset, Thomas parks the
threads first and then sets the sample period, which I believe avoids the
race. You should be able to see that in patch 16,
softlockup_reconfigure_threads().
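
To make the ordering difference concrete, here is a rough userspace model
in C. It is not the kernel code: the struct, the function names and
NR_WATCHDOGS are all hypothetical, and the body of
softlockup_reconfigure_threads() is not reproduced here. It only counts how
many watchdog threads are still runnable at the moment the sample period
changes under each ordering.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_WATCHDOGS 4	/* hypothetical: one watchdog thread per CPU */

/* Hypothetical model of the state touched during reconfiguration. */
struct watchdog_model {
	bool parked[NR_WATCHDOGS];
	unsigned long sample_period;
};

/* Count threads that are not parked, i.e. that could see a period change. */
static int running_threads(const struct watchdog_model *m)
{
	int n = 0;

	for (int cpu = 0; cpu < NR_WATCHDOGS; cpu++)
		if (!m->parked[cpu])
			n++;
	return n;
}

static void set_all_parked(struct watchdog_model *m, bool parked)
{
	for (int cpu = 0; cpu < NR_WATCHDOGS; cpu++)
		m->parked[cpu] = parked;
}

/*
 * Old ordering as described above: change the period, then park.
 * Returns the number of threads that were runnable during the update.
 */
static int reconfigure_old_order(struct watchdog_model *m, unsigned long period)
{
	int exposed;

	assert(period > 0);
	exposed = running_threads(m);	/* nobody has been parked yet */
	m->sample_period = period;
	set_all_parked(m, true);
	set_all_parked(m, false);
	return exposed;
}

/*
 * New ordering as described above: park, change the period, unpark.
 * Returns the number of threads that were runnable during the update.
 */
static int reconfigure_new_order(struct watchdog_model *m, unsigned long period)
{
	int exposed;

	assert(period > 0);
	set_all_parked(m, true);
	exposed = running_threads(m);	/* everything is parked by now */
	m->sample_period = period;
	set_all_parked(m, false);
	return exposed;
}
```

In this model the old ordering exposes every thread to the update, while
the new ordering exposes none. It deliberately ignores the case Uli
raised, where watchdog/N cannot run at all because CPU N is hogged.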

Cheers,
Don