Re: [PATCH v2 4/7] iov_iter: new iov_iter_pin_pages*() routines
From: Christoph Hellwig
Date: Fri Sep 23 2022 - 04:44:53 EST
On Fri, Sep 23, 2022 at 05:22:17AM +0100, Al Viro wrote:
> > Add a iov_iter_unpin_pages that does the right thing based on the
> > type. (block will need a modified copy of it as it doesn't keep
> > the pages array around, but logic will be the same).
>
> Huh? You want to keep the type (+ direction) of iov_iter in any structure
> a page reference coming from iov_iter_get_pages might end up in? IDGI...
Why would I? We generally do have, or should have, the iov_iter around.
And for the common case where we don't (bios) we can carry that
information in the bio as it needs a special unmap helper anyway.
But if you don't want to use the iov_iter for some reason, we'll just
need to condense the information to a flags variable and then pass that.
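
A minimal userspace sketch of the "condense the information to a flags variable" idea follows. Everything in it is a hypothetical stand-in (struct model_iter, struct model_page, the MODEL_REL_* flags and both helpers); none of it is kernel API or code from this series. It assumes that only user-backed pages are pinned and need unpinning, and that pages must be marked dirty when the I/O wrote into them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel definitions. */
enum model_iter_type { MODEL_ITER_UBUF, MODEL_ITER_BVEC };

struct model_iter {
	enum model_iter_type type;
	bool data_dest;		/* true if the I/O writes into the pages */
};

struct model_page {
	int pincount;
	bool dirty;
};

#define MODEL_REL_UNPIN	(1u << 0)	/* pages were pinned, drop the pin */
#define MODEL_REL_DIRTY	(1u << 1)	/* pages were written to, mark dirty */

/* Condense iterator type and direction into flags at submission time. */
static unsigned int model_release_flags(const struct model_iter *iter)
{
	unsigned int flags = 0;

	/* Assumption: only user-backed pages get pinned in this model. */
	if (iter->type == MODEL_ITER_UBUF)
		flags |= MODEL_REL_UNPIN;
	if (iter->data_dest)
		flags |= MODEL_REL_DIRTY;
	return flags;
}

/* Release pages at completion time using only the saved flags. */
static void model_release_pages(struct model_page **pages, size_t nr,
				unsigned int flags)
{
	for (size_t i = 0; i < nr; i++) {
		if (flags & MODEL_REL_DIRTY)
			pages[i]->dirty = true;
		if (flags & MODEL_REL_UNPIN) {
			assert(pages[i]->pincount > 0);
			pages[i]->pincount--;
		}
	}
}
```

The point of the split is that the completion side no longer needs the iterator: whatever structure ends up holding the pages only has to carry the flags word.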
>
> BTW, speaking of lifetime rules - am I right assuming that fd_execute_rw()
> does IO on pages of the scatterlist passed to it?
Yes.
> Where are they getting
> dropped and what guarantees that IO is complete by that point?
The exact place depends on the exact target frontend, of which we have
a few. But it happens from the end_io callback that is triggered
through a call to target_complete_cmd.
> The reason I'm asking is that here you have an ITER_BVEC possibly fed to
> __blkdev_direct_IO_async(), with its
> if (iov_iter_is_bvec(iter)) {
> /*
> * Users don't rely on the iterator being in any particular
> * state for async I/O returning -EIOCBQUEUED, hence we can
> * avoid expensive iov_iter_advance(). Bypass
> * bio_iov_iter_get_pages() and set the bvec directly.
> */
> bio_iov_bvec_set(bio, iter);
> which does *not* grab the page references. Sure, bio_release_pages() knows
> to leave those alone and doesn't drop anything. However, what is the
> mechanism preventing the pages getting freed before the IO completion
> in this case?
The contract that callers of bvec iters need to hold their own
references, as without that doing I/O to them would be unsafe. If they
did not hold references the pages could go away before even calling
bio_iov_iter_get_pages (or this open coded bio_iov_bvec_set).
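
The ownership contract described above can be illustrated with a small userspace model. All names here (struct model_page, struct model_bio and the model_* helpers) are hypothetical and are not the block layer API. The model assumes a bio either takes its own page reference or merely borrows the caller's, loosely in the spirit of the kernel's BIO_NO_PAGE_REF flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel definitions. */
struct model_page {
	int refcount;
};

struct model_bio {
	struct model_page *page;
	bool owns_ref;		/* false: the page is borrowed from the caller */
};

/* User-memory path in this model: the bio takes its own reference. */
static void model_bio_get_page(struct model_bio *bio, struct model_page *page)
{
	page->refcount++;
	bio->page = page;
	bio->owns_ref = true;
}

/*
 * bvec path in this model: no reference is taken. The caller must
 * keep its own reference until the I/O has completed.
 */
static void model_bio_set_bvec(struct model_bio *bio, struct model_page *page)
{
	assert(page->refcount > 0);	/* the caller's side of the contract */
	bio->page = page;
	bio->owns_ref = false;
}

/* Completion: drop only what the bio itself took. */
static void model_bio_release(struct model_bio *bio)
{
	if (bio->owns_ref) {
		assert(bio->page->refcount > 0);
		bio->page->refcount--;
	}
	bio->page = NULL;
}
```

In the borrowed case nothing in the bio keeps the page alive. The only thing preventing a free before completion is the reference the caller already holds, which is why the caller must not drop it until its completion callback has run.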