Re: slow open() calls and o_nonblock

From: Trond Myklebust
Date: Mon Jun 04 2007 - 15:46:30 EST


On Mon, 2007-06-04 at 12:26 -0400, Aaron Wiebe wrote:
> Actually, lets see if I can summarize this more generically... I
> realize I'm suggesting something that probably would be a massive
> undertaking, but ..
>
> Regular files are the only interface that requires an application to
> wait. With any other case, the nonblocking interfaces are fairly
> complete and easy to work with. If userspace could treat regular
> files in the same fashion as sockets, life would be good.
>
> I admittedly do not understand internal kernel semantics in the
> differences between a socket and a regular file. Why couldn't we just
> have a different 'socket type' like PF_FILE or something like this?
>
> Abstracting any IO through the existing interfaces provided to sockets
> would be ideal from my perspective. The code required to use a file
> through these interfaces would be more complex in userspace, but the
> abstraction of the current open() itself could simply be an aggregate
> of these interfaces without a nonblocking flag.
>
> It would, however, fix problems around issues with event-based
> applications handling events from both disk and sockets. I can't
> trigger disk read/write events in the same event handlers I use for
> sockets (ie, poll or epoll). I end up having two separate event
> handlers - one for disk (currently using glibc's aio thread kludge),
> and one for sockets.
>
> I'm sure this isn't a new idea. Coming from my own development
> backround that had little to do with disk, I was actually surprised
> when I first discovered that I couldn't edge-trigger disk IO through
> poll().
>
> Thoughts, comments?

Unless you're planning on rearchitecting the entire VFS lookup and
permissions code, you would basically have to fall back onto having a
pool of service threads actually perform the I/O. That can just as
easily be done today in userland.

AFAICS, syslets should give you the means to implement a more scalable
scheme, but we'll have to wait and see if/when those are ready for
kernel inclusion.

Cheers
Trond

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/