Re: O_DIRECT wierd behavior..

From: Riley Williams (rhw@MemAlpha.cx)
Date: Wed Dec 26 2001 - 09:54:38 EST


Hi Andrew, Linus.

>> RETURN VALUE
>>
>> Upon successful completion, write() and pwrite() will return the
>> number of bytes actually written to the file associated with
>> fildes. This number will never be greater than nbyte. Otherwise,
>> -1 is returned and errno is set to indicate the error.
>>
>> I take that to mean that if an error occurs, we return that
>> error regardless of how much was written.

> I disagree.
>
> Note that writing 15 characters out of 30 is also a "successful write" -
> it's just a _partial_ write.
>
> So it is acceptable to return an intermediate value.

In the "Linux Device Drivers" book, the author writes quite a lot of
drivers for virtual devices that actually rely on the ability to return
short writes in many cases. Many of the standard UNIX utilities rely on
just the same behaviour as well.

>> Which makes sense. Consider this code...
>>
>> open(file)
>> write(100k)
>> close(fd)

...which ought to look like this code...

        open(file)
        ptr = fillbuffer(100k)
        endptr = ptr + 100k
        while (ptr < endptr) {
                result += write(ptr, endptr - ptr)
                if (result > 0)
                        ptr += result
                else
                        ERROR
        }
        close(file)

...as anything else is just plain buggy and can't be otherwise.

>> if the write gets an IO error halfway through, it looks like
>> the caller never gets to hear about it at present.

> No, the caller _does_ get to hear about it. If the caller cares
> about robust handling, it will notice "Hmm, I tried to write 100k
> bytes, but the system only wrote 50k, what's up"?
>
> Note that the caller _has_ to do this anyway, or it wouldn't be able
> to handle things like interruptible NFS mounts, sockets, pipes,
> out-of-disk errors etc etc.

It's also not exactly hard to do the right thing - there's little extra
in the revised code snippet to the original one.

> And if the caller does _not_ care about robustness, then who cares?
> It's going to ignore whatever we return anyway.

Precicely why that sort of code is so essential.

Best wishes from Riley.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Dec 31 2001 - 21:00:12 EST