Re: [feature] automatically detect hung TASK_UNINTERRUPTIBLE tasks

From: Ingo Molnar
Date: Mon Dec 03 2007 - 05:24:30 EST



* Radoslaw Szkodzinski <lkml@xxxxxxxxxxxxxxxxxxxxxxx> wrote:

> > iirc TASK_KILLABLE fixed NFS only. While that's a good thing there
> > are unfortunately a lot more subsystems that would need the same
> > treatment.
>
> Yes, that's exactly why the patch is needed - to find the bugs and fix
> them. Otherwise you'll have problems finding some places to convert to
> TASK_KILLABLE.
>
> CIFS and similar have to be fixed - it tends to lock the app using it,
> in unkillable state.

Amen. I still have to see a single rational argument against this
debugging feature - and tons of arguments were listed in favor of it. So
let's just try and see what happens.

> > Yes let's break things first instead of looking at the implications
> > closely.
>
> Throwing _rare_ stack traces is not breakage. 120s
> task_uninterruptible in the usual case (no errors) is already broken -
> there are no sane loads that can invoke that IMO.
>
> A stack trace on x subsystem error is not that bad, especially as
> these are limited to 10 per session.

we could lower that limit to 1 per bootup - if they become annoying.
There's lots of flexibility in the code. Really, we should have done
this 10 years ago - it would have literally saved me many days of
debugging time combined, and i really have experience in identifying
such bad tasks. (and it would have sped up debugging in countless number
of instances when users were met with an uninterruptible task.)

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/