Re: [RFC PATCH 0/7] block, fs: convert Direct IO to FOLL_PIN
From: Chaitanya Kulkarni
Date: Fri Feb 25 2022 - 11:14:28 EST
On 2/25/22 04:05, Jan Kara wrote:
> On Fri 25-02-22 00:50:18, John Hubbard wrote:
>> Hi,
>>
>> Summary:
>>
>> This puts some prerequisites in place, including a CONFIG parameter,
>> making it possible to start converting and testing the Direct IO part of
>> each filesystem, from get_user_pages_fast(), to pin_user_pages_fast().
>>
>> It will take "a few" kernel releases to get the whole thing done.
>>
>> Details:
>>
>> As part of fixing the "get_user_pages() + file-backed memory" problem
>> [1], and to support various COW-related fixes as well [2], we need to
>> convert the Direct IO code from get_user_pages_fast(), to
>> pin_user_pages_fast(). Because pin_user_pages*() calls require a
>> corresponding call to unpin_user_page(), the conversion is more
>> elaborate than just substitution.
>>
>> Further complicating the conversion, the block/bio layers get their
>> Direct IO pages via iov_iter_get_pages() and iov_iter_get_pages_alloc(),
>> each of which has a large number of callers. All of those callers need
>> to be audited and changed so that they call unpin_user_page(), rather
>> than put_page().
>>
>> After quite some time exploring and consulting with people as well, it
>> is clear that this cannot be done in just one patchset. That's because,
>> not only is this large and time-consuming (for example, Chaitanya
>> Kulkarni's first reaction, after looking into the details, was, "convert
>> the remaining filesystems to use iomap, *then* convert to FOLL_PIN..."),
>> but it is also spread across many filesystems.
>
> With having modified fs/direct-io.c and fs/iomap/direct-io.c which
> filesystems do you know are missing conversion? Or is it that you just want
> to make sure with audit everything is fine? The only fs I could find
> unconverted by your changes is ceph. Am I missing something?
if I understand your comment correctly file systems which are listed in
the list see [1] (all the credit goes to John to have a complete list)
that are not using iomap but use XXX_XXX_direct_IO() should be fine,
since in the callchain going from :-
XXX_XXX_direct_io()
__blkdev_direct_io()
do_direct_io()
...
submit_page_selection()
get/put_page() <---
will take care of itself ?
>
> Honza
>
[1]
jfs_direct_IO()
nilfs_direct_IO()
ntfs_dirct_IO()
reiserfs_direct_IO()
udf_direct_IO()
ocfs2_dirct_IO()
affs_direct_IO()
exfat_direct_IO()
ext2_direct_IO()
fat_direct_IO()
hfs_direct_IO()
hfs_plus_direct_IO()
-ck