Re: [PATCH 1/2] fs: add SEEK_HOLE and SEEK_DATA flags

From: Nick Bowler
Date: Mon Apr 25 2011 - 11:03:00 EST


Hi Eric,

On 2011-04-22 07:06 -0600, Eric Blake wrote:
> I've created a request to standardize SEEK_HOLE and SEEK_DATA in the
> next revision of POSIX; comments are welcome to make sure that everyone
> is happy with wording:
> http://austingroupbugs.net/view.php?id=415

Reading through your proposal, I think there is one thing that could be
clarified: the meaning of "the last hole" in the file. Consider the
following two file layouts -- in the "diagrams", a bar (|) indicates a
boundary between holes and non-hole data, and a double bar (||)
indicates end-of-file.

* File A (sparse file created by lseek/write beyond end-of-file):

data | hole 0 | data || hole 1 (virtual)

* File B (sparse file created by truncate beyond end-of-file):

data | hole 0 || hole 1 (virtual)
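
For concreteness, the two layouts could be produced roughly as follows.
This is only a sketch: the 4096-byte BLOCK size, the /tmp scratch path
and all helper names are illustrative assumptions, and whether the
unwritten range is really stored as a hole is up to the filesystem.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK 4096  /* assumed hole granularity; real filesystems vary */

/* Hypothetical helper: open an already-unlinked scratch file in /tmp. */
int open_scratch(void)
{
    char path[] = "/tmp/sparse-XXXXXX";
    int fd = mkstemp(path);

    if (fd >= 0)
        unlink(path);
    return fd;
}

/* Write one block of non-zero bytes at the current file offset. */
static int write_block(int fd)
{
    char buf[BLOCK];

    memset(buf, 'x', sizeof buf);
    return write(fd, buf, sizeof buf) == (ssize_t)sizeof buf ? 0 : -1;
}

/* File A: data | hole 0 | data ||  (lseek past EOF, then write) */
int make_file_a(int fd)
{
    if (write_block(fd) < 0)
        return -1;
    if (lseek(fd, 2 * BLOCK, SEEK_SET) < 0)
        return -1;
    return write_block(fd);
}

/* File B: data | hole 0 ||  (truncate to a size beyond EOF) */
int make_file_b(int fd)
{
    if (write_block(fd) < 0)
        return -1;
    return ftruncate(fd, 2 * BLOCK);
}

/* Report the file size, or -1 if fstat fails. */
off_t file_size(int fd)
{
    struct stat st;

    return fstat(fd, &st) == 0 ? st.st_size : (off_t)-1;
}
```

With these helpers, File A ends up three blocks long and File B two
blocks long; in both cases the second block is never written.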

Excluding the error description, the term "the last hole" is used in
two places in your proposal:

* (for SEEK_HOLE): if offset falls within "the last hole", then the
file offset may be set to the file size instead.

* (for SEEK_DATA): it shall be an error ... if offset falls within the
last hole.

I imagine that both of these conditions are intended to address the
case where the offset falls within hole 0 in File B, that is, when
there is no non-hole data beyond the specified offset but the offset
is nevertheless less than the file size. However, this looks (to me)
like the penultimate hole in the file, not the last hole. Furthermore,
these conditions are presumably *not* intended to apply to the
penultimate hole in File A, which has data after it.

I think my confusion can be avoided by talking about the last non-hole
data byte in the file (which is unambiguous), instead of by talking
about the last hole. For instance, the SEEK_HOLE/SEEK_DATA descriptions
could be written as follows:

  If whence is SEEK_HOLE, the file offset shall be set to the smallest
  location of a byte within a hole and not less than offset, except that
  if offset falls beyond the last byte not within a hole, then the file
  offset may be set to the file size instead. It shall be an error if
  offset is greater than or equal to the size of the file.

  If whence is SEEK_DATA, the file offset shall be set to the smallest
  location of a byte not within a hole and not less than offset. It shall
  be an error if no such byte exists.

plus a corresponding update to the ENXIO description:

  ... or the whence argument is SEEK_DATA and the offset falls beyond
  the last byte not within a hole.
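
To check the suggested wording against the two example files, it can be
modelled in a few lines of C. This is a sketch of the semantics only,
not of any kernel implementation: the extent-list representation, the
4096-byte block size and every name below are hypothetical, errors are
returned as negative errno values, and the optional "may be set to the
file size instead" behaviour for SEEK_HOLE is not taken.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model: a file is a size plus a sorted list of
 * non-overlapping data extents [start, end); the rest is hole. */
struct extent { long start, end; };

struct model_file {
    long size;
    const struct extent *data;
    size_t ndata;
};

/* File A: data | hole 0 | data ||   File B: data | hole 0 || */
static const struct extent a_data[] = { { 0, 4096 }, { 8192, 12288 } };
static const struct extent b_data[] = { { 0, 4096 } };
static const struct model_file file_a = { 12288, a_data, 2 };
static const struct model_file file_b = { 8192, b_data, 1 };

/* SEEK_DATA: smallest offset >= offset that is not within a hole;
 * ENXIO if offset is beyond the last byte not within a hole. */
long model_seek_data(const struct model_file *f, long offset)
{
    if (offset < 0)
        return -EINVAL;
    for (size_t i = 0; i < f->ndata; i++) {
        if (offset < f->data[i].end)
            return offset > f->data[i].start ? offset : f->data[i].start;
    }
    return -ENXIO;
}

/* SEEK_HOLE: smallest offset >= offset that is within a hole, where
 * the virtual hole at end-of-file counts; ENXIO if offset >= size. */
long model_seek_hole(const struct model_file *f, long offset)
{
    if (offset < 0)
        return -EINVAL;
    if (offset >= f->size)
        return -ENXIO;
    for (size_t i = 0; i < f->ndata; i++) {
        if (offset < f->data[i].start)
            return offset;            /* already inside a hole */
        if (offset < f->data[i].end)
            offset = f->data[i].end;  /* skip to the end of this extent */
    }
    return offset;                    /* trailing hole or end-of-file */
}
```

In this model, an offset inside hole 0 of File A finds data at 8192
under SEEK_DATA, while the same offset in File B yields ENXIO, which is
the distinction the phrase "the last hole" leaves unclear.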

--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)