Re: [RFC 4/9] dm: add llseek(SEEK_HOLE/SEEK_DATA) support

From: Eric Blake
Date: Wed Apr 03 2024 - 15:28:33 EST


On Wed, Apr 03, 2024 at 01:58:38PM -0400, Stefan Hajnoczi wrote:
> > Yes, we want to ensure that:
> >
> > off1 = lseek(fd, -1, SEEK_END);
> > off2 = off1 + 1; // == lseek(fd, 0, SEEK_END)
> >
> > if off1 belongs to a data extent:
> > - lseek(fd, off1, SEEK_DATA) == off1
> > - lseek(fd, off1, SEEK_HOLE) == off2
> > - lseek(fd, off2, SEEK_DATA) == -ENXIO
> > - lseek(fd, off2, SEEK_HOLE) == -ENXIO
>
> Agreed.
>
> > if off1 belongs to a hole:
> > - lseek(fd, off1, SEEK_DATA) == -ENXIO
> > - lseek(fd, off1, SEEK_HOLE) == off1
> > - lseek(fd, off2, SEEK_DATA) == -ENXIO
> > - lseek(fd, off2, SEEK_HOLE) == -ENXIO
>
> Agreed.
>
..
> >
> > That is, we can never pass in off2 (beyond the bounds of the table),
> > and when passing in off1, I think this interface may be easier to work
> > with in the intermediate layers, even though it differs from the
> > lseek() interface above. For off1 in data:
> > - dm_blk_do_seek_hole_data(dm, off1, SEEK_DATA) == off1
> > - dm_blk_do_seek_hole_data(dm, off1, SEEK_HOLE) == off2
> > and for a hole:
> > - dm_blk_do_seek_hole_data(dm, off1, SEEK_DATA) == off2
> > - dm_blk_do_seek_hole_data(dm, off1, SEEK_HOLE) == off1
>

In the caller, we already need to know both off1 and off2 (that is,
the offset we are asking about, AND the offset at which the underlying
component ends its map at, since that is where we are then in charge
of whether to concatenate that with the next component or give the
overall result).

If we already guarantee that we never call into a lower layer with
off2 (because it was beyond bounds), then the only difference between
the two semantics is whether SEEK_DATA in a final hole returns -ENXIO
or off2; and it looks like we can do a conversion from either style
into the other.

So designing the caller logic, it looks like I would want:

if off1 >= EOF return -ENXIO (out of bounds)

while off1 < EOF:

determine off2 of current lower region
at this point, we are guaranteed off1 < off2
we also know that off2 < EOF unless we are on last lower region
call result=lower_layer(off1, SEEK_X)
it is a bug if result is non-negative but < off1, or if result > off2

if result == off1, return result (we are already in the right HOLE
or DATA)

if result > off1 and < off2, return result (we had to advance to get
to the right region, but the right region was within the lower
layer, and we don't need to inspect further layers)

Note - the above two paragraphs can be collapsed into one: if result
< off2, return result (the current subregion gave us an adequate
answer)

if result == off2, continue to the next region with off1=result (in
first semantics, this is only possible for SEEK_HOLE for a lower
region ending in data; in the second semantics, it is possible for
either SEEK_HOLE or SEEK_DATA for a lower region ending in the type
opposite of the request)

if result == -ENXIO, continue to the next region by using off1=off2
(only possible in the first semantics for SEEK_DATA on a lower
region ending in a hole)

any other result is necessarily a negative errno like -EIO; return it

if the loop completes without an early return, we are out of lower
regions, and we should be left with off1 == EOF. Because we advanced,
we know now to:
return whence == SEEK_HOLE ? off1 : -ENXIO

> I'll take a look again starting from block/fops.c, through dm.c, and
> into dm-linear.c to see how to make things clearest. Although I would
> like to have the same semantics for every seek function, maybe in the
> end your suggestion will make the code clearer. Let's see.

Keeping lseek semantics may require a couple more lines of code
duplicated at each layer, but less maintainer headaches in the long
run of converting between slightly different semantics depending on
which layer you are at.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org