RE: [RFC PATCH] block, fs: use FOLL_LONGTERM as gup_flags for direct IO
From: Sooyong Suk
Date: Thu Mar 06 2025 - 21:07:26 EST
> On Fri, Mar 7, 2025 at 12:26 AM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> >
> > On Thu, Mar 06, 2025 at 04:40:56PM +0900, Sooyong Suk wrote:
> > > There are GUP references to pages that are serving as direct IO
> > > buffers. Those pages can be allocated from CMA pageblocks even
> > > though they can be pinned until the DIO is completed.
> >
> > direct I/O is exactly the case that is not FOLL_LONGTERM and one of the
> > reasons to even have the flag. So big fat no to this.
> >
>
Understood.
> Hello, thank you for your comment.
> We, Sooyong and I, wanted to get some opinions about using FOLL_LONGTERM
> for direct I/O, because we have seen CMA memory holding pages that were
> pinned by direct I/O.
>
> > You also completely failed to address the relevant mailing list and
> > maintainers.
>
> I added block maintainer Jens Axboe and the block layer mailing list here,
> and added Suren and Sandeep, too.
In that case, what do you think of using PF_MEMALLOC_PIN in this context,
as below? It only removes __GFP_MOVABLE from the allocation flags used
while the pages are being pinned.
Since the comment on __bio_iov_iter_get_pages() says that it will pin user
or kernel pages, there seems to be no reason not to use this process flag.
 block/bio.c | 3 +++
 1 file changed, 3 insertions(+)
diff --git a/block/bio.c b/block/bio.c
index 65c796ecb..671e28966 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1248,6 +1248,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	unsigned len, i = 0;
 	size_t offset;
 	int ret = 0;
+	unsigned int flags;
 
 	/*
 	 * Move page array up in the allocated memory for the bio vecs as far as
@@ -1267,9 +1268,11 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	 * result to ensure the bio's total size is correct. The remainder of
 	 * the iov data will be picked up in the next bio iteration.
 	 */
+	flags = memalloc_pin_save();
 	size = iov_iter_extract_pages(iter, &pages,
 				      UINT_MAX - bio->bi_iter.bi_size,
 				      nr_pages, extraction_flags, &offset);
+	memalloc_pin_restore(flags);
 	if (unlikely(size <= 0))
 		return size ? size : -EFAULT;
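
For readers less familiar with the scoped save/restore pattern used above,
here is a minimal userspace sketch of the idea. It is not kernel code: every
mock_* name and both flag values are hypothetical stand-ins, and the restore
logic is only loosely modeled on how the memalloc_*_save()/restore() helpers
behave. It illustrates two things under those assumptions: the movable bit is
masked only while the pin scope is active, and nested scopes restore cleanly.

```c
/* Illustrative bit values only; they are NOT the kernel's definitions. */
#define MOCK_PF_MEMALLOC_PIN 0x1u   /* stands in for PF_MEMALLOC_PIN */
#define MOCK_GFP_MOVABLE     0x8u   /* stands in for __GFP_MOVABLE */

/* Stands in for current->flags of the calling task. */
static unsigned int mock_task_flags;

/* Enter a pin scope: set the flag and return its previous state. */
static unsigned int mock_pin_save(void)
{
	unsigned int old = mock_task_flags & MOCK_PF_MEMALLOC_PIN;

	mock_task_flags |= MOCK_PF_MEMALLOC_PIN;
	return old;
}

/* Leave a pin scope: put the flag back to the state that save returned. */
static void mock_pin_restore(unsigned int old)
{
	mock_task_flags = (mock_task_flags & ~MOCK_PF_MEMALLOC_PIN) | old;
}

/*
 * Adjust an allocation mask for the current scope: while the pin flag is
 * set, the movable bit is dropped so the allocation avoids movable (and
 * therefore CMA) pageblocks. All other bits pass through unchanged.
 */
static unsigned int mock_gfp_context(unsigned int gfp)
{
	if (mock_task_flags & MOCK_PF_MEMALLOC_PIN)
		gfp &= ~MOCK_GFP_MOVABLE;
	return gfp;
}
```

Because save returns the previous state rather than unconditionally clearing
the flag on exit, an inner save/restore pair does not end an outer scope early.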