Re: [PATCH] aio: partial write should not return error code.

From: Zach Brown
Date: Thu Jan 03 2008 - 15:04:58 EST


Rusty Russell wrote:
> When an AIO write gets an error after writing some data (eg. ENOSPC),
> it should return the amount written already, not the error. Just like
> write() is supposed to.

Andrew, please don't queue this fix. I think the bug is valid but the
patch is subtly dangerous.

> diff -r 18802689361a fs/aio.c
> --- a/fs/aio.c Thu Jan 03 15:22:24 2008 +1100
> +++ b/fs/aio.c Thu Jan 03 18:05:25 2008 +1100
> @@ -1346,6 +1350,13 @@ static ssize_t aio_rw_vect_retry(struct
> /* This means we must have transferred all that we could */
> /* No need to retry anymore */
> if ((ret == 0) || (iocb->ki_left == 0))
> + ret = iocb->ki_nbytes - iocb->ki_left;
> +
> + /* If we managed to write some out we return that, rather than
> + * the eventual error. */
> + if (opcode == IOCB_CMD_PWRITEV
> + && ret < 0
> + && iocb->ki_nbytes - iocb->ki_left)
> ret = iocb->ki_nbytes - iocb->ki_left;

This doesn't account for the (sigh) -EIOCB* error codes. They must be
returned to the caller so that it can properly handle the iocb reference
counting. Failure to do so can lead to oopses.

To be fair, I think you'll have a really hard time finding an
->aio_write() implementation which would return partial progress and
*then* one of the magical errnos. But the infrastructure does allow it.

So maybe we could get a helper in aio.h that abstracts out the

(ret < 0 && ret != -EIOCBQUEUED && ret != -EIOCBRETRY)

condition. Then I think this patch would be fine.

I assigned a bug to remind myself to revisit this if you aren't excited
by continuing with the patch:

http://bugzilla.kernel.org/show_bug.cgi?id=9681

- z
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/