Re: Seeking Linux watchdog design advice to troubleshoot mysterious silent reboot issue

From: Vincent Li
Date: Sat Dec 17 2011 - 17:50:53 EST


On Wed, Dec 14, 2011 at 2:11 PM, Linus Walleij <linus.walleij@xxxxxxxxxx> wrote:
> On Mon, Dec 5, 2011 at 8:55 PM, Vincent Li <vincent.mc.li@xxxxxxxxx> wrote:
>
>> We have a complex system with a large number of processes running
>> simultaneously. If any of the processes gets into a faulty state and
>> hangs or consumes more than its fair share of the system resources,
>> the other processes may not get a chance to run, and the whole system
>> can hang, interrupting the system functionality and user traffic.
>
> Have you tried using RLIMITs?
>
> Last time I used something like this from each process:
>
> #include <sys/time.h>
> #include <sys/resource.h>
>
> struct rlimit rl;
> int ret;
>
> // Allow each process at most 5 seconds of CPU time
> rl.rlim_cur = rl.rlim_max = 5;
> ret = setrlimit(RLIMIT_CPU, &rl);
> // Allow each real-time process at most 1 second (1000000 microseconds)
> // of CPU time without making a blocking system call
> rl.rlim_cur = rl.rlim_max = 1000000;
> ret = setrlimit(RLIMIT_RTTIME, &rl);
>
> The latter is good if you have real-time processes.
>
> There are also RLIMITs for memory consumption.
>
> Consult:
> http://kernel.org/doc/man-pages/online/pages/man2/getrlimit.2.html
>
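
For reference, the fragment quoted above can be fleshed out into a small
self-contained sketch. This is only an illustration under assumptions, not
code from the thread: the helper names cap_limit() and apply_limits() are
hypothetical, and the clamping to the existing hard limit is an added
precaution so that an unprivileged process does not fail with EPERM.

```c
#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Hypothetical helper (not from the thread): set both the soft and the
 * hard limit of `resource` to `value`. The value is clamped to the current
 * hard limit, because an unprivileged process may lower its hard limit
 * but may not raise it. Returns 0 on success, -1 on failure.
 */
static int cap_limit(int resource, rlim_t value)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) != 0) {
        perror("getrlimit");
        return -1;
    }

    /* Never ask for more than the hard limit already in force. */
    if (rl.rlim_max != RLIM_INFINITY && value > rl.rlim_max)
        value = rl.rlim_max;

    rl.rlim_cur = value;
    rl.rlim_max = value;

    if (setrlimit(resource, &rl) != 0) {
        perror("setrlimit");
        return -1;
    }
    return 0;
}

/*
 * Hypothetical wrapper applying the two limits from the quoted fragment:
 * 5 seconds of CPU time, and 1000000 microseconds of real-time CPU time
 * without a blocking system call.
 */
static int apply_limits(void)
{
    int ret = cap_limit(RLIMIT_CPU, 5);

#ifdef RLIMIT_RTTIME    /* Linux-specific; skipped where unavailable */
    if (ret == 0)
        ret = cap_limit(RLIMIT_RTTIME, 1000000);
#endif
    return ret;
}
```

One way to use this would be to call apply_limits() early in main() of each
process. Resource limits are inherited across fork() and preserved across
execve(), so a launcher could also set them once before starting children.
When the soft RLIMIT_CPU limit is reached the kernel sends SIGXCPU; at the
hard limit it sends SIGKILL.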

Thank you for the link; I will look into it.

Vincent