Re: How to increat [sic.] max open files?

James L. McGill (fishbowl@fotd.netcomi.com)
Fri, 3 Jan 1997 05:00:00 -0600 (CST)


On Thu, 2 Jan 1997, Richard B. Johnson wrote:

> If you make space available for
> more than "normal" (whatever that is), you waste valuable RAM.

RAM value is relative to the value of the application and the need
to have it run.

> If we were
> not stuck with "FILE *file" stuff in 'C', the maximum number of FD's could
> be limited only by the largest positive number that can be described by
> an "int" on the platform in use.

Agreed.
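
For what it's worth, the kernel side already deals in plain ints --
open() hands back an int, and FILE is just libc's wrapper around one.
The practical ceilings come from userland: stdio and the fixed-size
fd_set that select() uses. A trivial illustration, nothing
Apache-specific:

    #include <stdio.h>      /* FILE, fopen, fileno, printf */
    #include <fcntl.h>      /* open, O_RDONLY */
    #include <sys/time.h>   /* fd_set, FD_SETSIZE */

    int main(void)
    {
        int   fd = open("/etc/passwd", O_RDONLY);  /* kernel handle: a plain int */
        FILE *fp = fopen("/etc/passwd", "r");      /* libc wrapper around such an int */

        if (fd < 0 || fp == NULL) {
            perror("open");
            return 1;
        }

        /* select() can only watch descriptors below FD_SETSIZE,
           however many the kernel itself would be willing to hand out. */
        printf("fd=%d  fileno(fp)=%d  FD_SETSIZE=%d\n",
               fd, fileno(fp), FD_SETSIZE);
        return 0;
    }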

> This is not true of all operating systems. Some operating systems make
> artificial limits, but the physical limit is ONLY the maximum number that
> can be stored in an "int". Anyway, your "unlimited" file-handles isn't
> possible.

Agreed, given that the current state of the art in hardware will fall short
long before the range of a 32-bit integer is exhausted in this context.

> Even a long int or a quadword, etc., have limits. This presumes
> that such a handle won't be used as an index into a fixed-length table of
> some sort.

Agreed.

> Therefore, the natural question is:
>
> How many files SHOULD a process access?

Well, my original question was:

How can I maximize the number of Virtual HTTPd servers
using Apache and Intel Linux? Currently I am able to
run 123 servers per machine with a stock Linux kernel.
This is not enough for my needs.

By patching the kernel to 1024 file handles I am able
to increase that number proportionally. I need to increase
it even more, as the hardware we run on is quite capable
of handling much more. The only thing enforcing this
ceiling is the limit designed into Linux and libc, and
the reasons for that design limit are unclear.
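
For the curious, the patch in question is just a handful of constants.
Roughly what it looks like on a 2.0-era tree -- the paths and values
here are from memory and purely illustrative, so check your own headers
before trusting them:

    /* include/linux/limits.h and include/linux/fs.h: per-process cap */
    #define NR_OPEN         1024        /* stock value is 256 */

    /* include/linux/fs.h: size of the system-wide open-file table */
    #define NR_FILE         4096        /* stock value is 1024 */

    /* The libc side has to agree as well: __FD_SETSIZE (the width of
       an fd_set) must be raised to match, or anything built around the
       old fd_set size -- select() in particular -- will misbehave. */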

My problem is:

It is known that other Unix platforms already support
solutions to this exact problem. My colleagues and I
have applied pressure to keep Linux as our preferred
system, despite this limitation. One reason we have
done this is political: WE WANT TO CONTINUE TO BE
SUPPORTERS OF LINUX IN A LARGE-SCALE COMMERCIAL
ENVIRONMENT, as we always have been. There can never
be enough people willing to run Linux as their
mission-critical system. We thoroughly enjoy this
status, and we are aware of the risks.

We have reached a point where a design limitation of Linux
is a problem for us. I have confidence that it will be
fixed before we are forced to migrate to another platform.

> If you state "all it wants", then
> you need some other kind of operating system.

Nobody in the Linux community really wants that to be the solution.

> If you state 10,000, I can
> show a good reason why you will need 10,001.

I want to have two instances of Apache httpd, each with 506 Virtual Servers,
each with four open files, plus the overhead of httpd itself, plus enough FD's
available for CGI's to run, etc.
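
The arithmetic, roughly: 506 virtual servers times four descriptors
each is 2,024 per httpd instance before httpd's own logs and sockets
or any CGI children are counted -- which is why the 1024-handle patch
still falls short, and why I talk about something on the order of
four thousand per process below.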

I think this is reasonable. It's do-able on Solaris and BSD. The notion
that I will have to use one of those brings a tear to my eye. I hoped that
fact would get the attention of the Linux developers.

> In other words, there MUST
> be some kind of limit.

Yes, and the limit should be commensurate with what the hardware is
capable of. 256 FD's per process is woefully short of what a PPro200
with 512MB of RAM can handle. I happen to have an application which
needs more, and even using more it will still not push the hardware
(or the OS, for that matter) to its limits. I do not think I am the
only one with such a need.
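
If anyone wants to see where their own box stands, the per-process
ceiling is visible from userland. A quick check, assuming an
ordinarily POSIX-ish libc:

    #include <stdio.h>
    #include <unistd.h>           /* sysconf, _SC_OPEN_MAX */
    #include <sys/resource.h>     /* getrlimit, RLIMIT_NOFILE */

    int main(void)
    {
        struct rlimit rl;

        /* What the kernel will let this process have... */
        if (getrlimit(RLIMIT_NOFILE, &rl) == 0)
            printf("RLIMIT_NOFILE: soft=%ld hard=%ld\n",
                   (long) rl.rlim_cur, (long) rl.rlim_max);

        /* ...and what libc believes the ceiling to be. */
        printf("sysconf(_SC_OPEN_MAX) = %ld\n", sysconf(_SC_OPEN_MAX));
        return 0;
    }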

> I think that a task, process, program, etc., that needs more than 100
> file handles is improperly written.

I agree with that. But Apache httpd is a pretty well-written program.
It's widely accepted as the best solution for a large-scale web server.
We have never had a problem that could be attributed to Apache itself.
I do not intend to take on the task of rewriting it, but I have
notified its developers of this problem.

> Keeping that many files open at any
> one time will cause file destruction if the system crashes.

There is always file destruction when the system crashes. So?

> On the other
> hand, opening/reading/writing/closing files in rapid sucession is not
> very efficient.

Absolutely fatal if we take it to this scale.

> A file-handle limit forces a programmer to think about
> this and design (rather than just write), the program.

I agree with that in principle. I would agree with it in this context,
too, if Apache couldn't already do this on Solaris and BSD.

> Lets say that you have a "mount daemon" that is going to perform NFS
> file system access for thousands of clients on the network. I think
> another daemon should be created if the first runs out of file-handles.
> Each time a daemon's resource capability is exceeded, another is created.

Agreed. The reason an HTTPD server must run as one big process
is that it must bind all those addresses to one port: 80.

BTW, I do not have a mount daemon for thousands of clients on the network.
Has there been a problem scaling Linux to large NFS applications as well?

> Each time a daemon closes its last file, it expires. Now, you have 100
> daemons when you need them and 1 daemon when you only need it.

That would work except that there can only be one port 80.
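
To make that concrete: the second daemon's bind() simply fails. A
bare-bones sketch -- it needs root for a port below 1024, and the
error handling is trimmed to the one case that matters here:

    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    int main(void)
    {
        int s = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;

        memset(&addr, 0, sizeof(addr));
        addr.sin_family      = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port        = htons(80);

        if (bind(s, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
            /* A second, independent daemon lands here with EADDRINUSE,
               so every virtual host has to live inside the one process
               (and its children) that already owns port 80. */
            perror("bind");
            return 1;
        }
        listen(s, 128);
        printf("bound to *:80\n");
        return 0;
    }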

> The same is true of database programs, etc. There must be some kind of
> discipline enforced by the operating system or you name it chaos.

Agreed. However, this will not be a problem when Linux allows, e.g.,
four thousand open files per process. I solemnly swear that I will
not come back and say: "Oops. I need 8192." Somebody with an IRC
server probably will though.

--
g-r-a-t-e-f-u-l-l-y---[   email:<fishbowl@conservatory.com>   ]---l-i-v-i-n-g
d-e-a-d-i-c-a-t-e-d---[ http://www.conservatory.com/~fishbowl ]-----l-i-g-h-t

Quantum Mechanics is God's version of "Trust me."