BIND problem with hundreds of IP aliases under Linux 2.0.33

Ben Gertzfield (che@debian.org)
22 Jan 1998 18:47:45 -0800


Software: bind
Version: 8.1.1
Platform: Debian GNU/Linux, libc6, kernel 2.0.33

I'm running into a problem starting BIND on a machine with several
hundred IP-aliased eth0 interfaces. I want to run a caching-only
nameserver on our web box, which serves about 240 virtual domains,
but BIND dies on startup with this message:

[root@everybody:~]# /etc/init.d/bind start
Starting domain name service: namedsocket(SOCK_STREAM): Invalid argument
/etc/init.d/bind: line 32: 3744 Aborted start-stop-daemon --start --quiet --exec /usr/sbin/named
.

I thought it *might* be a kernel limit, so I raised the system-wide
limits on open files and inodes, and the free-page thresholds, like so:

echo 4096 > /proc/sys/kernel/file-max
echo 12288 > /proc/sys/kernel/inode-max
echo 300 400 500 > /proc/sys/vm/freepages

to no avail.
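
The three settings above are system-wide. A per-process descriptor
limit is a separate thing, so it seems worth checking too. A minimal
sketch, assuming Python with its standard resource module is available
on the box (the helper name per_process_fd_limit is mine, purely for
illustration):

```python
import resource

def per_process_fd_limit():
    """Return the (soft, hard) limit on open descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

soft, hard = per_process_fd_limit()
# Note: a hard limit of "unlimited" is reported as resource.RLIM_INFINITY.
print("per-process descriptor limit: soft=%d hard=%d" % (soft, hard))
```

This only reports the limit for the process that runs it. Whether this
kernel allows the limit to be raised for named is something I have not
verified.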

The only workaround I found was to take down about 200 of the
IP-aliased eth0 interfaces -- then BIND started up quite happily. The
limit seems to be reached at about 100 or so IP aliases.

If it could be confirmed that this is a kernel problem, I'd be very
happy. If it could be confirmed that this is a BIND problem, I'd also
be pretty happy. :)

To be exact, there are 243 IP aliases on this box. BIND only refuses
to start up if there are more than 100 or so.
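
As a back-of-the-envelope check on where a descriptor limit would
bite, here is a rough sketch. Everything in it is an assumption: that
named holds two sockets per alias (a guess based on the trace below,
where a bind to port 53 is followed by a SOCK_STREAM socket call), that
about 20 descriptors go to everything else, and that the per-process
limit is 256 or 1024. None of these figures has been measured on this
box.

```python
def descriptors_needed(aliases, sockets_per_alias=2, overhead=20):
    """Estimate descriptors held: sockets for each alias plus fixed overhead.

    sockets_per_alias and overhead are illustrative guesses, not measured.
    """
    return aliases * sockets_per_alias + overhead

def max_aliases(limit, sockets_per_alias=2, overhead=20):
    """Largest alias count whose estimated descriptors fit under limit."""
    return (limit - overhead) // sockets_per_alias

# 256 and 1024 are hypothetical per-process limits used for comparison.
for limit in (256, 1024):
    fits = descriptors_needed(243) <= limit
    print("limit %d: at most %d aliases, 243 aliases fit: %s"
          % (limit, max_aliases(limit), fits))
```

Under these guesses, a limit of 256 would allow roughly 118 aliases,
which is in the same range as the "100 or so" I am seeing, while 243
aliases would need about 506 descriptors.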

Here's the relevant part of the strace output:

1188 fcntl(19, F_DUPFD, 20) = -1 EMFILE (Too many open files)
1188 gettimeofday({885521914, 532826}, NULL) = 0
1188 time([885521914]) = 885521914
1188 getpid() = 1188
1188 sigaction(SIGPIPE, {0x4007c380, [], 0}, {0x8061740, [], SA_STACK|0xdec30}) = 0
1188 send(3, "<29>Jan 22 18:18:34 named[1188]:"..., 78, 0) = 78
1188 sigaction(SIGPIPE, {0x8061740, [], 0x4a}, NULL) = 0
1188 setsockopt(19, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
1188 getsockopt(19, SOL_SOCKET, SO_RCVBUF, [65535], [4]) = 0
1188 bind(19, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("209.133.21.134")}, 16) = 0
1188 fcntl(19, F_GETFL) = 0x2 (flags O_RDWR)
1188 fcntl(19, F_SETFL, O_RDWR|O_NONBLOCK) = 0
1188 socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = -1 EINVAL (Invalid argument)
1188 gettimeofday({885521914, 536597}, NULL) = 0
1188 time([885521914]) = 885521914
1188 getpid() = 1188
1188 sigaction(SIGPIPE, {0x4007c380, [], 0}, {0x8061740, [], SA_STACK|0xdec50}) = 0
1188 send(3, "<26>Jan 22 18:18:34 named[1188]:"..., 71, 0) = 71
1188 sigaction(SIGPIPE, {0x8061740, [], 0x4a}, NULL) = 0
1188 gettimeofday({885521914, 538948}, NULL) = 0
1188 time([885521914]) = 885521914
1188 getpid() = 1188
1188 sigaction(SIGPIPE, {0x4007c380, [], 0}, {0x8061740, [], SA_STACK|0xdec50}) = 0
1188 send(3, "<26>Jan 22 18:18:34 named[1188]:"..., 71, 0) = 71
1188 sigaction(SIGPIPE, {0x8061740, [], 0x4a}, NULL) = 0
1188 write(2, "socket(SOCK_STREAM): Invalid arg"..., 38) = 38
1188 sigprocmask(SIG_UNBLOCK, [ABRT], NULL) = 0
1188 getpid() = 1188
1188 kill(1188, SIGABRT) = 0
1188 --- SIGABRT (Aborted) ---
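
The full trace is long, so to pick out just the failing calls I would
use a small filter along these lines. It is only a sketch: the regular
expression assumes the "PID call(args) = -1 ERRNO (message)" layout
seen above, with each call on a single line, and the names FAILED_CALL
and failed_calls are made up for this example.

```python
import re

# Matches lines of the form: PID syscall(args...) = -1 ERRNO (message)
FAILED_CALL = re.compile(
    r"^\s*(\d+)\s+(\w+)\(.*\)\s+=\s+-1\s+(E[A-Z0-9]+)\s+\((.*)\)\s*$"
)

def failed_calls(lines):
    """Return (pid, syscall, errno_name, message) for each failed call."""
    found = []
    for line in lines:
        match = FAILED_CALL.match(line)
        if match:
            pid, syscall, errno_name, message = match.groups()
            found.append((int(pid), syscall, errno_name, message))
    return found

# Three lines copied from the trace above; only two of them are failures.
SAMPLE = [
    "1188 fcntl(19, F_DUPFD, 20) = -1 EMFILE (Too many open files)",
    "1188 fcntl(19, F_GETFL) = 0x2 (flags O_RDWR)",
    "1188 socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = -1 EINVAL (Invalid argument)",
]

for call in failed_calls(SAMPLE):
    print(call)
```

On the excerpt above this leaves two calls: the fcntl F_DUPFD that
fails with EMFILE, and the socket call that fails with EINVAL just
before named aborts.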

Thanks,

Ben Gertzfield

-- 
Brought to you by the letters A and M and the number 9.
"Mmm.. Soylent Green.." -- Homer Simpson
Ben Gertzfield <http://www.imsa.edu/~wilwonka/> Finger me for my public
PGP key. I'm on FurryMUCK as Che, and EFNet and YiffNet IRC as Che_Fox.