Re: [PATCH 2/2] block: no-copy bvec for direct IO
From: Johannes Weiner
Date: Fri Dec 11 2020 - 11:17:13 EST
On Fri, Dec 11, 2020 at 03:47:23PM +0000, Pavel Begunkov wrote:
> On 11/12/2020 15:38, Johannes Weiner wrote:
> > On Fri, Dec 11, 2020 at 02:20:11PM +0000, Pavel Begunkov wrote:
> >> On 11/12/2020 14:06, Johannes Weiner wrote:
> >>> On Wed, Dec 09, 2020 at 08:40:05AM +0000, Christoph Hellwig wrote:
> >>>>> + /*
> >>>>> + * In practice groups of pages tend to be accessed/reclaimed/refaulted
> >>>>> + * together. To not go over bvec for those who didn't set BIO_WORKINGSET
> >>>>> + * approximate it by looking at the first page and inducing it to the
> >>>>> + * whole bio
> >>>>> + */
> >>>>> + if (unlikely(PageWorkingset(iter->bvec->bv_page)))
> >>>>> + bio_set_flag(bio, BIO_WORKINGSET);
> >>>>
> >>>> IIRC the feedback was that we do not need to deal with BIO_WORKINGSET
> >>>> at all for direct I/O.
> >>>
> >>> Yes, this hunk is incorrect. We must not use this flag for direct IO.
> >>> It's only for paging IO, when you bring in the data at page->mapping +
> >>> page->index. Otherwise you tell the pressure accounting code that you
> >>> are paging in a thrashing page, when really you're just reading new
> >>> data into a page frame that happens to be hot.
> >>>
> >>> (As per the other thread, bio_add_page() currently makes that same
> >>> mistake for direct IO. I'm fixing that.)
> >>
> >> I have that stuff fixed, it just didn't go into the RFC. That's basically
> >> removing replacing add_page() with its version without BIO_WORKINGSET
>
> I wrote something strange... Should have been "replacing add_page() in
> those functions with a version without BIO_WORKINGSET".
No worries, I understood.
> >> in bio_iov_iter_get_pages() and all __bio_iov_*_{add,get}_pages() +
> >> fix up ./fs/direct-io.c. Should cover all direct cases if I didn't miss
> >> some.
> >
> > Ah, that's fantastic! Thanks for clarifying.
>
> To keep it clear, do we go with what I have stashed (I'm planning to
> reiterate this weekend)? or you're going to write it up yourself?
> Just in case there is some cooler way you have in mind :)
Honestly, I only wrote all my ideas down and asked for feedback
because I wasn't super excited about any of them ;-)
If your changes happen to separate the direct io path from the
buffered io path naturally, I'm okay with it.
I'd say let's go with what you already have and see whether Jens and
Christoph like it. We can always do follow-on cleanups.