Re: Exploring sigqueue leak (was: Strange 'zombie' problem both in 2.4 and 2.6)

From: Denis Vlasenko
Date: Fri Apr 02 2004 - 17:14:53 EST


On Friday 02 April 2004 15:05, Nikita V. Youshchenko wrote:
> Hello.
>
> Yesterday I posted to linux-kernel describind a problem that caused both
> 2.4 and 2.6 kernels to leave lots of zombies.
> I analysed the problem and found it is caused by overflow of 'sigqueue'
> slab cache (I may be wrong in terminology here). Details are in my
> previous post.
>
> Now I'm looking for advice from somebody who understands how stuff in
> linux/signal.c works.
>
> [the following ins about unpatched 2.6.4 kernel]
>
> Seems that 'sigqueue' objects are allocated exactly in 2 places: in
> __sigqueue_alloc() and in send_signal().
>
> __sigqueue_alloc() seems to be used only by sigqueue_alloc();
> sigqueue_alloc() seems only to be used by alloc_posix_timer() from
> kernel/posix-timers.c;
> alloc_posix_timer() is a static function used only by sys_timer_create()
> syscall handler.
> This syscall is not one widely used; I'm not sure it it is actually used by
> anything running here.
>
> So the leak seems to happen through allocation of 'sigqueue' objects in
> send_signal().
> However, if I understand correctly, object allocated here keeps information
> about signal being transferred, and lives as long as this tranfer is in
> progress.
> Signal transfer is somewhat very fast.
> So, if everything is working ok, usually zero 'sigqueue' objects should be
> allocated.
>
> Seems that it is possible to find out how many 'sigqueue' objects are
> currently allocated by running
> grep sigqueue /proc/slabinfo
> and looking at the first number of output.
> So seems that almost always the first number in the output of the above
> command should be zero.
> And that's exactly what happens on all hosts here expect the server.

Sorry, still no real help from me but maybe some useful data points:

On my home box:

# grep sigqueue /proc/slabinfo
sigqueue 18 25 156 25 1 : tunables 32 16 0 : slabdata 1 1 0 : globalstat 32 18 1 0 0 0 57 : cpustat 33581 2 33583 0
Another home box:
# grep sigqueue /proc/slabinfo
sigqueue 9 25 156 25 1 : tunables 32 16 0 : slabdata 1 1 0 : globalstat 16 16 1 0 0 0 57 : cpustat 71984 1 71985
Third box:
# grep sigqueue /proc/slabinfo
sigqueue 16 25 156 25 1 : tunables 32 16 0 : slabdata 1 1 0 : globalstat 16 16 1 0 0 0 57 : cpustat 1388121 1 1388122 0

No more 2.6 boxes for you here ;)

So, not exactly zero but small.

Now, 2.4 boxes:

# grep sigqueue /proc/slabinfo
sigqueue 0 29 132 0 1 1
# grep sigqueue /proc/slabinfo
sigqueue 8 29 132 1 1 1
# grep sigqueue /proc/slabinfo
sigqueue 0 0 132 0 0 1 : 252 126

Smaller.
--
vda

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/