Re: [PATCH v1 10/15] block: add gup flag to bio_add_page()/bio_add_pc_page()/__bio_add_page()

From: Dan Williams
Date: Tue Apr 16 2019 - 12:54:37 EST


On Tue, Apr 16, 2019 at 9:47 AM Jan Kara <jack@xxxxxxx> wrote:
>
> On Mon 15-04-19 11:24:33, Jerome Glisse wrote:
> > On Mon, Apr 15, 2019 at 04:59:52PM +0200, Jan Kara wrote:
> > > Hi Jerome!
> > >
> > > On Thu 11-04-19 17:08:29, jglisse@xxxxxxxxxx wrote:
> > > > From: JÃrÃme Glisse <jglisse@xxxxxxxxxx>
> > > >
> > > > We want to keep track of how we got a reference on page added to bio_vec
> > > > ie wether the page was reference through GUP (get_user_page*) or not. So
> > > > add a flag to bio_add_page()/bio_add_pc_page()/__bio_add_page() to that
> > > > effect.
> > >
> > > Thanks for writing this patch set! Looking through patches like this one,
> > > I'm a bit concerned. With so many bio_add_page() callers it's difficult to
> > > get things right and not regress in the future. I'm wondering whether the
> > > things won't be less error-prone if we required that all page reference
> > > from bio are gup-like (not necessarily taken by GUP, if creator of the bio
> > > gets to struct page he needs via some other means (e.g. page cache lookup),
> > > he could just use get_gup_pin() helper we'd provide). After all, a page
> > > reference in bio means that the page is pinned for the duration of IO and
> > > can be DMAed to/from so it even makes some sense to track the reference
> > > like that. Then bio_put() would just unconditionally do put_user_page() and
> > > we won't have to propagate the information in the bio.
> > >
> > > Do you think this would be workable and easier?
> >
> > It might be workable but i am not sure it is any simpler. bio_add_page*()
> > does not take page reference it is up to the caller to take the proper
> > page reference so the complexity would be push there (just in a different
> > place) so i don't think it would be any simpler. This means that we would
> > have to update more code than this patchset does.
>
> I agree that the amount of work in this patch set is about the same
> (although you don't have to pass the information about reference type in
> the biovec so you save the complexities there). But for the future the
> rule that "bio references must be gup-pins" is IMO easier to grasp for
> developers and you can reasonably assert it in bio_add_page().
>
> > This present patch is just a coccinelle semantic patch and even if it
> > is scary to see that many call site, they are not that many that need
> > to worry about the GUP parameter and they all are in patch 11, 12, 13
> > and 14.
> >
> > So i believe this patchset is simpler than converting everyone to take
> > a GUP like page reference. Also doing so means we loose the information
> > about GUP kind of defeat the purpose. So i believe it would be better
> > to limit special reference to GUP only pages.
>
> So what's the difference whether the page reference has been acquired via
> GUP or via some other means? I don't think that really matters. If say
> infiniband introduced new ioctl() that takes file descriptor, offset, and
> length and just takes pages from page cache and attaches them to their RDMA
> scatter-gather lists, then they'd need to use 'pin' references anyway...
>
> Then why do we work on differentiating between GUP pins and other page
> references? Because it matters what the reference is going to be used for
> and what is it's lifetime. And generally GUP references are used to do IO
> to/from page and may even be controlled by userspace so that's why we need
> to make them different. But in principle the 'gup-pin' reference is not about
> the fact that the reference has been obtained from GUP but about the fact
> that it is used to do IO. Hence I think that the rule "bio references must
> be gup-pins" makes some sense.

+1 to this idea. I don't see the need to preserve the concept that
some biovecs carry non-GUP pages.