Re: BIND hangs with 2.6.14

From: Steinar H. Gunderson
Date: Sun Oct 30 2005 - 16:40:53 EST


On Sun, Oct 30, 2005 at 08:05:38PM +0100, Steinar H. Gunderson wrote:
> I have a run going in valgrind now to see if it can find anything bad about
> the pointers in the msg_hdr structure (the structure itself appears to be OK,
> judging from my printf-debugging); it's been going for a few hours, so I hope
> it will be entering its zombie mode now soon :-)

I finally caught it with gdb, after inserting some debug probes. Excerpts
from the session (with a few of my mistyped commands removed):

(gdb) bt
#0 0xffffe405 in __kernel_vsyscall ()
#1 0x55840885 in raise () from /lib/tls/i686/cmov/libc.so.6
#2 0x55842002 in abort () from /lib/tls/i686/cmov/libc.so.6
#3 0x557e1383 in doio_recv (sock=0x80d1230, dev=0x8197ac8) at socket.c:917
#4 0x557e41d5 in internal_recv (me=0x81975a0, ev=0x80d1284) at socket.c:2012
#5 0x557d6259 in dispatch (manager=0x8094960) at task.c:855
#6 0x557d64c7 in run (uap=0x8094960) at task.c:998
#7 0x5580cca3 in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#8 0x558eff5a in clone () from /lib/tls/i686/cmov/libc.so.6
(gdb) up
#1 0x55840885 in raise () from /lib/tls/i686/cmov/libc.so.6
(gdb)
#2 0x55842002 in abort () from /lib/tls/i686/cmov/libc.so.6
(gdb)
#3 0x557e1383 in doio_recv (sock=0x80d1230, dev=0x8197ac8) at socket.c:917
917 abort();
(gdb) print msghdr
$1 = {msg_name = 0x8197b14, msg_namelen = 28, msg_iov = 0x561519e0, msg_iovlen = 1, msg_control = 0x809a810, msg_controllen = 52, msg_flags = 0}
(gdb) print msghdr.msg_name
$2 = (void *) 0x8197b14
(gdb) print (char *)msghdr.msg_name
$3 = 0x8197b14 ""
(gdb) print ((char *)msghdr.msg_name)[0]
$4 = 0 '\0'
(gdb) print ((char *)msghdr.msg_name)[27]
$5 = 0 '\0'
(gdb) print ((char *)msghdr.msg_control)[0]
$6 = 20 '\024'
(gdb) print ((char *)msghdr.msg_control)[51]
$7 = -66 '¾'
(gdb) print *(msghdr.msg_iov)
$9 = {iov_base = 0x8171208, iov_len = 4096}
(gdb) print ((char*)msghdr.msg_iov.iov_base)[0]
$10 = -30 'â'
(gdb) print ((char*)msghdr.msg_iov.iov_base)[4095]
$11 = -66 '¾'
(gdb) print sock->fd
$12 = 22
(gdb) print recvmsg(sock->fd, &msghdr, 0)
$14 = 42

IOW, the call that just failed suddenly worked in the debugger. I can't
really believe this is a BIND bug anymore... I'm lost here. Anyone?
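
The backtrace above shows the abort() but not the errno that recvmsg()
returned, so a probe that saves errno right at the failing call might help
narrow this down. Below is a minimal sketch of such a probe. It is not BIND's
doio_recv(): the helper name recv_logged(), the 64-byte control buffer and the
log format are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper (not part of BIND): receive one datagram with
 * recvmsg() and, on failure, log errno together with the msghdr fields.
 * errno is copied before any other library call can overwrite it.
 */
ssize_t recv_logged(int fd, void *buf, size_t buflen, int *saved_errno)
{
    struct sockaddr_storage from;      /* large enough for any address family */
    union {
        char buf[64];                  /* ancillary data; size chosen arbitrarily */
        struct cmsghdr align;          /* forces suitable alignment */
    } control;
    struct iovec iov;
    struct msghdr msg;
    ssize_t n;

    assert(buf != NULL && saved_errno != NULL);

    memset(&from, 0, sizeof from);
    memset(&msg, 0, sizeof msg);
    iov.iov_base = buf;
    iov.iov_len = buflen;
    msg.msg_name = &from;
    msg.msg_namelen = sizeof from;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof control.buf;

    n = recvmsg(fd, &msg, 0);
    *saved_errno = (n < 0) ? errno : 0;

    if (n < 0) {
        fprintf(stderr,
                "recvmsg(fd=%d) failed: %s (errno=%d); "
                "namelen=%lu iovlen=%lu controllen=%lu flags=%d\n",
                fd, strerror(*saved_errno), *saved_errno,
                (unsigned long)msg.msg_namelen,
                (unsigned long)msg.msg_iovlen,
                (unsigned long)msg.msg_controllen,
                msg.msg_flags);
    }
    return n;
}
```

Calling recv_logged(sock_fd, buffer, sizeof buffer, &err) in place of the bare
recvmsg() leaves the errno value in err, so it can be inspected from gdb
even after abort() has run.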

/* Steinar */
--
Homepage: http://www.sesse.net/