Re: fork: Resource temporarily unavailable / cant start new threads
From: mark
Date: Wed May 21 2008 - 18:52:40 EST
On Wed, May 21, 2008 at 2:32 PM, Randy Dunlap <randy.dunlap@xxxxxxxxxx> wrote:
> On Wed, 21 May 2008 14:08:53 -0700 mark wrote:
>
>> On Wed, May 21, 2008 at 1:50 PM, Randy Dunlap <randy.dunlap@xxxxxxxxxx> wrote:
>> > mark wrote:
>> >>
>> >> On Wed, May 21, 2008 at 1:28 PM, Randy Dunlap <randy.dunlap@xxxxxxxxxx>
>> >> wrote:
>> >>>
>> >>> On Tue, 20 May 2008 11:26:47 -0700 mark wrote:
>> >>>>
>> >>>> I upgraded to 2.6.25.3-18.fc9.x86_64 fedora core 9, now I get this
>> >>>> error when I try to login to the box, kill a pr start a python app, or
>> >>>> do anything on a regular basis.
>> >>>>
>> >>>> fork: Resource temporarily unavailable
>> >>>>
>> >>>> I have over 10GB RAM free, and zero swap space used. The box is a
>> >>>> dual quad core Intel Xeon 5405 with 16GB RAM.
>> >>>>
>> >>>> There is no error message in /var/log/messages or dmesg ...
>> >>>> how do I identify the problem?
>> >>>> thanks!
>> >>>>
>> >>>> uname -a
>> >>>> Linux XXX 2.6.25.3-18.fc9.x86_64 #1 SMP Tue May 13 04:54:47 EDT 2008
>> >>>> x86_64 x86_64 x86_64 GNU/Linux
>> >>>>
>> >>>>
>> >>>> free -m
>> >>>>              total   used   free  shared  buffers  cached
>> >>>> Mem:         16086   3189  12896       0       42     666
>> >>>> -/+ buffers/cache:   2481  13605
>> >>>> Swap:         1983      0   1983
>> >>>>
>> >>>>
>> >>>> have only 505 processes running
>> >>>> ps aux | wc -l
>> >>>> 505
>> >>>>
>> >>>>
>> >>>> uptime
>> >>>> 11:24:15 up 39 min, 1 user, load average: 3.54, 3.47, 2.87
>> >>>>
>> >>>> ulimit -a
>> >>>> core file size (blocks, -c) 0
>> >>>> data seg size (kbytes, -d) unlimited
>> >>>> scheduling priority (-e) 0
>> >>>> file size (blocks, -f) unlimited
>> >>>> pending signals (-i) 137216
>> >>>> max locked memory (kbytes, -l) 32
>> >>>> max memory size (kbytes, -m) unlimited
>> >>>> open files (-n) 32768
>> >>>> pipe size (512 bytes, -p) 8
>> >>>> POSIX message queues (bytes, -q) 819200
>> >>>> real-time priority (-r) 0
>> >>>> stack size (kbytes, -s) 10240
>> >>>> cpu time (seconds, -t) unlimited
>> >>>> max user processes (-u) 1024
>> >>>> virtual memory (kbytes, -v) unlimited
>> >>>> file locks (-x) unlimited
>> >>>
>> >>> The only place that fork() returns EAGAIN is for number of
>> >>> processes being >= its limit. Does this user already have >= 1024
>> >>> processes?
>> >>
>> >> No, it is around 400
>> >
>> > Well, my comment was wrong anyway. There are several other tests just
>> > below number of user processes that also return EAGAIN, like:
>> >
>> > - total number of threads being too large
>
> Total number of threads currently running is in /proc/loadavg:
>
>> cat /proc/loadavg
> 1.56 0.58 0.27 2/203 28500
>
> It's the number following the '/', e.g., 203 on my desktop system.
>
> max_threads allowed is a sysctl, so you can tune it if needed.
> It's in /proc/sys/kernel/threads-max:
>
>> cat /proc/sys/kernel/threads-max
> 32624
> I sort of doubt that one is the problem, but you can tell us.
cat /proc/loadavg
0.39 0.45 0.57 1/1412 12032
cat /proc/sys/kernel/threads-max
274432
You are right; I guess this is not the problem.
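
For anyone repeating this check, here is a minimal Python sketch of the
comparison above. It assumes the /proc/loadavg layout quoted above, where the
fourth field is runnable/total. The function names are hypothetical, made up
for illustration, and not part of any existing tool.

```python
def parse_loadavg_threads(loadavg_text):
    """Return (runnable, total) taken from the 4th field of /proc/loadavg."""
    fields = loadavg_text.split()
    runnable, total = fields[3].split("/")
    return int(runnable), int(total)


def thread_headroom(loadavg_text, threads_max_text):
    """Return how many more threads fit under kernel.threads-max."""
    _, total = parse_loadavg_threads(loadavg_text)
    return int(threads_max_text.strip()) - total


def report():
    """Read both /proc files on a live Linux box and print the comparison."""
    with open("/proc/loadavg") as f:
        loadavg = f.read()
    with open("/proc/sys/kernel/threads-max") as f:
        threads_max = f.read()
    print("total threads:", parse_loadavg_threads(loadavg)[1])
    print("headroom     :", thread_headroom(loadavg, threads_max))
```

Calling report() on the box prints the live numbers. Fed the two values
pasted above (1412 threads, threads-max 274432), thread_headroom() leaves a
headroom of 273020 threads, so the system-wide limit is nowhere near reached.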
>> > - error on grabbing a module reference count (?)
>> > - error on grabbing a binfmt module reference
>>
>> As a user, how do I identify what is wrong and fix it? For the total
>> number of threads, is there any way I can find out if this is causing
>> the problem? My system is running around 80 multi-threaded Python web
>> apps.
>
> I can send you some debug patches that will print out the specific
> problem area. Do you want to do that? Can you rebuild and install
> a new kernel?
Is it possible to get these debug messages by turning on some flags?
If not, then yes, please send the debug patches. It's a live box, but I will try to do it!
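
One userspace check that does not need a kernel rebuild, sketched here under
stated assumptions: according to getrlimit(2), on Linux the "max user
processes" limit (RLIMIT_NPROC) counts threads, not only processes, so
`ps aux | wc -l` can undercount on a box full of multi-threaded apps. The
sketch below sums the Threads: field of /proc/<pid>/status for one UID and
prints it next to the soft limit. It assumes the status file carries Uid: and
Threads: lines. The helper names are hypothetical, and this may or may not be
the cause of the error in this thread.

```python
import os
import resource


def parse_status(status_text):
    """Return (real_uid, thread_count) from the text of /proc/<pid>/status."""
    uid = threads = None
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key == "Uid":
            uid = int(value.split()[0])  # first number is the real UID
        elif key == "Threads":
            threads = int(value.strip())
    return uid, threads


def threads_for_uid(uid, proc_root="/proc"):
    """Sum the thread counts of every process whose real UID matches."""
    total = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/loadavg
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                owner, threads = parse_status(f.read())
        except OSError:
            continue  # the process exited while we were scanning
        if owner == uid and threads is not None:
            total += threads
    return total


def report():
    """Print this user's thread count next to the soft RLIMIT_NPROC value."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NPROC)
    uid = os.getuid()
    limit = "unlimited" if soft == resource.RLIM_INFINITY else soft
    print("threads owned by uid %d: %d" % (uid, threads_for_uid(uid)))
    print("soft max user processes:", limit)
```

Running report() as the user that owns the web apps shows how close that
user's thread count is to the 1024 reported by `ulimit -u`.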
This is my system / kernel info:
uname -a
Linux XXX 2.6.25.3-18.fc9.x86_64 #1 SMP Tue May 13 04:54:47 EDT 2008
x86_64 x86_64 x86_64 GNU/Linux
Thanks a lot!