Re: Kernel patches enabling better POSIX AIO (Was Re: [3/4] kevent:AIO, aio_sendfile)

From: Ulrich Drepper
Date: Sat Aug 12 2006 - 15:09:07 EST

Suparna Bhattacharya wrote:
> I am wondering about that too. IIRC, the IO_NOTIFY_* constants are not
> part of the ABI, but only internal to the kernel implementation. I think
> Zach had suggested inferring THREAD_ID notification if the pid specified
> is not zero. But, I don't see why ->sigev_notify couldn't used directly
> (just like the POSIX timers code does) thus doing away with the
> new constants altogether. Sebestian/Laurent, do you recall?

I suggest to model the implementation after the timer code which does
exactly what we need.

> I'm guessing they are being used for validation of permissions at the time
> of sending the signal, but maybe saving the task pointer in the iocb instead
> of the pid would suffice ?

Why should any verification be necessary? The requests are generated in
the same process which will receive the notification. Even if the POSIX
process (aka, kernel process group) changes the IDs the notifications
should be set. The key is that notifications cannot be sent to another
POSIX process.

Adding this as a feature just makes things so much more complicated.

> So I think the
> intended behaviour is as you describe it should be

Then the documentation needs to be adjusted.

> The way it works (and better ideas are welcome) is that, since the io_submit()
> syscall already accepts an array of iocbs[], no new syscall was introduced.
> To implement lio_listio, one has to set up such an array, with the first iocb
> in the array having the special (new) grouping opcode of IOCB_CMD_GROUP which
> specifies the sigev notification to be associated with group completion
> (a NULL value of the sigev notification pointer would imply equivalent of

OK, this seems OK. We have to construct the iocb arrays dynamically anyway.

> My thought here was that it should be possible to include M as a parameter
> to the IOCB_CMD_GROUP opcode iocb, and thus incorporated in the lio control
> block ... then whatever semantics are agreed upon can be implemented.

If you have room for the parameter this is fine. For the beginning we
can enforce the number to be the same as the total number of requests.

> Let us know what you think about the listio interface ... hopefully the
> other issues are mostly simple to resolve.

It should be fine and I would support adding all this assuming the
normal file support (as opposed to direct I/O only) is added, too.

But I have one last question: sockets, pipes and the like are already
supported, right? If this is not the case we have a problem with the
currently proposed lio_listio interface.

â Ulrich Drepper â Red Hat, Inc. â 444 Castro St â Mountain View, CA â

Attachment: signature.asc
Description: OpenPGP digital signature