Re: Thread implementations...

Linus Torvalds (torvalds@transmeta.com)
Thu, 25 Jun 1998 10:13:56 -0700 (PDT)


On Thu, 25 Jun 1998, Erik Corry wrote:
>
> [ "sendfile()" ]
>
> I'm a little curious as to which circumstances you are thinking of.
> As far as I can see, it's a syscall for a single application (a
> web server serving static objects) which is basically little more
> than a benchmark.

It's actually perfectly usable for other things too, like ftp servers etc.

The way I would probably implement it, it would actually work for "cp" as
well - you could "sendfile()" to another file, not just to a socket.
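
For illustration, a file-to-file copy of this kind can be sketched with the sendfile(2) call as it exists in current Linux kernels (a file as the output descriptor is accepted since 2.6.33). This is only a sketch: it is not the implementation being contemplated in this mail, and the helper name copy_with_sendfile is made up for the example.

```c
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy all of src_path to dst_path with sendfile(2).
 * Returns 0 on success, -1 on any error. */
int copy_with_sendfile(const char *src_path, const char *dst_path)
{
    int in_fd = open(src_path, O_RDONLY);
    if (in_fd < 0)
        return -1;

    struct stat st;
    if (fstat(in_fd, &st) < 0) {
        close(in_fd);
        return -1;
    }

    int out_fd = open(dst_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out_fd < 0) {
        close(in_fd);
        return -1;
    }

    off_t remaining = st.st_size;
    int rc = 0;
    while (remaining > 0) {
        /* NULL offset: the kernel reads from, and advances, in_fd's own
         * file position. No user-space buffer is involved. */
        ssize_t sent = sendfile(out_fd, in_fd, NULL, (size_t)remaining);
        if (sent < 0) {
            rc = -1;
            break;
        }
        if (sent == 0)      /* source ended early (e.g. it was truncated) */
            break;
        remaining -= sent;
    }

    close(in_fd);
    if (close(out_fd) < 0)
        rc = -1;
    return rc;
}
```

The same loop works unchanged when out_fd is a connected socket, which is the web-server and ftp-server case.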

> If you really have such a hugely loaded web server
> you are likely to be doing lots of database lookups, cookie-controlled
> variable content, shtml, other cgi trickery, etc.

My personal observation has been that most webservers do mostly static
stuff, with a small percentage of dynamic behaviour. For example, even if
they have lots of CGI etc, often a big part of the page (bandwidth-wise)
tends to be pictures etc.

> And if you really
> just want to serve static objects as fast as possible, a round-robin
> DNS with multiple servers gets you more robustness and a solution that
> scales above Ethernet speeds.

That works if you have a _completely_ static setup. Which is one common
thing to have, but at the same time it is certainly not what most people
want to have.

> Would we just be doing this to look good against NT in webstones?

We want to do that too. I don't think it's only that, though. The apache
people get some impressive numbers out of Linux, but when I talk to Dean
Gaudet I also very often get the feeling that in order to get better
numbers they have to do really bad things, and those things are going to
slow them down in many circumstances.

One thing is actually the latency of setting up a small transfer. This
sounds unimportant, but it's actually fairly important in order to do well
under load: the lower your latency, the less likely you are to get into
the bad situation where you have lots of outstanding requests, and while
you serve those you get new requests at the same rate, so that past a
certain load you never make any progress.

That's one reason I don't like mmap() - it has horrible latency. mmap
under linux is fast, but it's really slow compared to what _could_ be
done. Similarly, "read()+write()" implies using user-space buffers, which
implies a certain amount of memory management and certainly bad
utilization of memory that could be better used for caching something.
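
The "read()+write()" pattern referred to above looks roughly like the following sketch. The function name copy_with_read_write and the 8 kB buffer size are arbitrary choices for the example; the point is only that every byte is staged in a user-space buffer on its way from one descriptor to the other.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: copy everything from in_fd to out_fd through a
 * user-space buffer. Returns the number of bytes copied, or -1 on error. */
long copy_with_read_write(int in_fd, int out_fd)
{
    char buf[8192];     /* user-space staging buffer: the extra copy */
    long total = 0;

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);   /* kernel -> user */
        if (n == 0)
            return total;                           /* end of input */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }

        ssize_t off = 0;
        while (off < n) {                           /* user -> kernel */
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;                               /* handle short writes */
        }
        total += n;
    }
}
```

Each pass through the loop costs two system calls and two copies of the data, and the buffer itself is memory that cannot be used for caching anything.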

And web serving is one of the things a lot of people want. And if they
make their judgements by benchmarks, we'd better be good at them. Never
discount benchmark numbers just because you don't like the benchmark: I
much prefer to go by real numbers than by "feeling".

I know some people that every time they see Linux beating somebody at a
benchmark, they claim that "the benchmark is meaningless, under real load
the issues are different". That's a cop-out. If NT is better than Linux at
something, we'd better look out or have a _really_ good explanation. And I
think webstone is "real enough" that we can't really explain it away.

(I'm not saying NT is faster - I don't actually know the numbers. But I
don't want to be in the situation that it could be faster).

Linus
