Re: nonblocking disk read again

Dan Kegel (dank@alumni.caltech.edu)
Wed, 13 Oct 1999 08:36:06 -0700


DK> It would be nice if one didn't have to hand the request to
DK> a worker thread if the file is already in memory,
DK> i.e. if sendfile() was able to return EWOULDBLOCK if
DK> the read operation would block because the data wasn't in core.

HH> I see a problem here. First - sendfile() will take some time
HH> even for a file in memory. Particularly a really big file.
HH> So you'll be blocking for a while,
HH> possibly delaying requests coming in via other network interfaces.

AV> The trick would be to fill the socket-buffer and THEN stop sendfile with
AV> EWOULDBLOCK, so that only kilobytes (instead of possibly megabytes) are
AV> transfered at a time. This way, one can keep the socket-buffers full
AV> (allowing full network-throughput) while keeping the time in one
sendfile()
AV> relatively small.

Right, except that you don't return EWOULDBLOCK, you return the
number of bytes transferred. This is how sendfile() is currently
specified for nonblocking network fd's.

I've given up on this idea for now, and will live with the context
switch overhead of having a worker thread issue the sendfile().

I asked sct whether my worker thread scheme uses sigwaitinfo()
correctly; here's his reply. Looks like some kind of nonblocking
disk read support may be coming in 2.5...

> > Put the client socket fd's into nonblocking mode.
> > Have the main thread call sigwaitinfo(), and place the client
> > events it gets into a mutex-protected queue.
> > Have a bunch of worker threads wait on the queue to
> > get an I/O need to handle.
>
> Fine
>
> > The workers will use sendfile to send file contents to the client.
> > If the main thread receives a SIGIO, the reliable signal queue
> > has overflowed, so it clears the queue by temporarily setting the
> > signal handler to SIG_DFL, then does a traditional poll(), and feeds
> > the events poll() returns into the queue. (If it gets another SIGIO
> > during this process, well, it just does another overflow poll,
> > and we devolve into the normal case.)
>
> Exactly. This is precisely how we have set up test servers using the
> new sigio code. btw, in 2.3.21 we finally have the V2 of that code
> which properly defines si_band as well as si_fd to tell you what sort
> of event happened, not just what fd it happened to. You can detect
> the version by checking whether si_code contains a valid POLL code
> (new version) or SI_SIGIO (old version).
>
> > There are tantalizing variations on this (have the worker threads
> > call sigwaitinfo themselves;
>
> That will be supported in the future, but right now we can't deal well
> enough with shared signal queues between threads.
>
> > add a nonblocking disk read feature to the kernel so the main thread
> > can do the sendfile's if the wouldn't block;
>
> Also planned. :) The kiobuf code in 2.3 (which is used as the basis of
> raw IO) is going to be the basis for kernel async IO functionality.

- Dan

-- 
(The above is my opinion alone, and not that of my employer)

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/