[PATCH v1 00/15] Keep track of GUPed pages in fs and block

From: jglisse
Date: Thu Apr 11 2019 - 17:08:50 EST


From: Jérôme Glisse <jglisse@xxxxxxxxxx>

This patchset depends on various small fixes [1] and on the patchset
that introduces put_user_page*() [2]. It is therefore 5.3 material, as
those prerequisites will get into 5.2 at best. Nonetheless I am posting
it now so that it can get review, and comments on how it should be
tested.

For various reasons [2] [3] we want to track page references taken
through GUP differently from "regular" page references. Thus we need to
keep track of how we got a page within the block and fs layers. To do so
this patchset changes the bio_vec struct to store a pfn and flags instead
of a direct pointer to a page. This way we can flag pages that are coming
from GUP.
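
To make the idea concrete, here is a minimal user-space sketch of a
bio_vec-like entry that stores a pfn plus a flag in one word. All the
sketch_* names, SKETCH_PFN_SHIFT, SKETCH_PFN_GUP and the one-bit layout
are assumptions made for illustration; they are not taken from the
patches themselves.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layout: the low bit of bv_pfn carries the flag. */
#define SKETCH_PFN_SHIFT 1
#define SKETCH_PFN_GUP   0x1UL  /* assumed flag: page was obtained via GUP */

/* Stand-in for struct bio_vec with bv_page replaced by bv_pfn. */
struct sketch_bio_vec {
	unsigned long bv_pfn;    /* (pfn << SKETCH_PFN_SHIFT) | flags */
	unsigned int  bv_len;
	unsigned int  bv_offset;
};

/* Store a pfn and record whether the page came from GUP. */
static inline void sketch_bvec_set_pfn(struct sketch_bio_vec *bv,
				       unsigned long pfn, bool from_gup)
{
	assert(bv != NULL);
	/* The pfn must leave room for the flag bit after shifting. */
	assert((pfn >> (sizeof(pfn) * CHAR_BIT - SKETCH_PFN_SHIFT)) == 0);
	bv->bv_pfn = (pfn << SKETCH_PFN_SHIFT) |
		     (from_gup ? SKETCH_PFN_GUP : 0UL);
}

/* Recover the pfn, discarding the flag bit. */
static inline unsigned long sketch_bvec_pfn(const struct sketch_bio_vec *bv)
{
	assert(bv != NULL);
	return bv->bv_pfn >> SKETCH_PFN_SHIFT;
}

/* Report whether the entry was flagged as coming from GUP. */
static inline bool sketch_bvec_is_gup(const struct sketch_bio_vec *bv)
{
	assert(bv != NULL);
	return (bv->bv_pfn & SKETCH_PFN_GUP) != 0;
}
```

Going through accessors like these is what lets callers stay unchanged
when the underlying field changes from a page pointer to a pfn.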

This patchset is divided as follows:
- The first part is just small cleanups; I believe they can go in as
  is, assuming people are OK with them.
- The second part converts bio_vec->bv_page to bio_vec->bv_pfn. This is
  done in multiple steps: first we replace all direct dereferences of
  the field by calls to inline helpers, then we introduce a macro for
  bio_vec structures that are initialized on the stack. Finally we
  change the bv_page field to bv_pfn.
- The third part replaces put_page(bv_page(bio_vec)) with a new helper
  which will use put_user_page() when the page in the bio_vec is
  coming from GUP.
- The fourth part updates BIO to use bv_set_user_page() for pages that
  are coming from GUP. This means updating bio_add_page*() to pass
  down the origin of the page (GUP or not).
- The fifth part converts a few more places that directly use bio_vec
  or BIO.
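
The third and fourth parts can be sketched in user space as follows. This
is only an illustration: every sketch_* name is hypothetical, the page is
a mock structure with counters, and the origin is kept as a plain bool
next to a page pointer rather than as a flag in a pfn.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock page: counts how it was released (illustration only). */
struct sketch_page {
	int refcount;     /* references still held */
	int put_regular;  /* releases through the regular path */
	int put_gup;      /* releases through the GUP path */
};

/* Stand-ins for put_page() and put_user_page(). */
static inline void sketch_put_page(struct sketch_page *p)
{
	p->refcount--;
	p->put_regular++;
}

static inline void sketch_put_user_page(struct sketch_page *p)
{
	p->refcount--;
	p->put_gup++;
}

/* Simplified entry: the page plus a record of where it came from. */
struct sketch_bio_vec {
	struct sketch_page *page;
	bool from_gup;
};

#define SKETCH_MAX_VECS 4

struct sketch_bio {
	struct sketch_bio_vec vecs[SKETCH_MAX_VECS];
	unsigned int cnt;
};

/* Adding a page passes down its origin (GUP or not). */
static inline bool sketch_bio_add_page(struct sketch_bio *bio,
				       struct sketch_page *page,
				       bool from_gup)
{
	assert(bio != NULL && page != NULL);
	if (bio->cnt >= SKETCH_MAX_VECS)
		return false;
	bio->vecs[bio->cnt].page = page;
	bio->vecs[bio->cnt].from_gup = from_gup;
	bio->cnt++;
	return true;
}

/* Release one entry through the path matching its origin. */
static inline void sketch_bvec_put_page(struct sketch_bio_vec *bv)
{
	assert(bv != NULL && bv->page != NULL);
	if (bv->from_gup)
		sketch_put_user_page(bv->page);
	else
		sketch_put_page(bv->page);
}

/* Release every page referenced by the bio. */
static inline void sketch_bio_release_pages(struct sketch_bio *bio)
{
	assert(bio != NULL);
	for (unsigned int i = 0; i < bio->cnt; i++)
		sketch_bvec_put_page(&bio->vecs[i]);
	bio->cnt = 0;
}
```

The point of the single release helper is that callers no longer need to
know the origin of a page at release time; it is recorded once, when the
page is added.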

Note that after this patchset there are still places in the kernel where
we should use put_user_page*(). The intention is to split that task into
chewable chunks (driver by driver, subsystem by subsystem).


I have only lightly tested this patchset (branch [4]) on my desktop and
have not seen anything obviously wrong, but I might have missed something.
What kind of test suite should I run to stress test the vfs/block layer
around DIO and BIO?


Note that you need a coccinelle [5] recent enough for the semantic patch
to work properly ([5] with git commit >= eac73d191e4f03d759957fc5620062428fadada8).

Cheers,
Jérôme Glisse

[1] https://cgit.freedesktop.org/~glisse/linux/commit/?h=gup-fs-block&id=5f67db69fd9f95d12987d2a030a82bc390e05a71
https://cgit.freedesktop.org/~glisse/linux/commit/?h=gup-fs-block&id=b070348d0e1fd9397eb8d0e97b4c89f1d04d5a0a
https://cgit.freedesktop.org/~glisse/linux/commit/?h=gup-fs-block&id=83691c86a6c8f560b5b78f3f57fcd62c0f3f1c7a
[2] https://lkml.org/lkml/2019/3/26/1395
[3] https://lwn.net/Articles/753027/
[4] https://cgit.freedesktop.org/~glisse/linux/log/?h=gup-fs-block
[5] https://github.com/coccinelle/coccinelle

Cc: linux-fsdevel@xxxxxxxxxxxxxxx
Cc: linux-block@xxxxxxxxxxxxxxx
Cc: linux-mm@xxxxxxxxx
Cc: John Hubbard <jhubbard@xxxxxxxxxx>
Cc: Jan Kara <jack@xxxxxxx>
Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
Cc: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>
Cc: Johannes Thumshirn <jthumshirn@xxxxxxx>
Cc: Christoph Hellwig <hch@xxxxxx>
Cc: Jens Axboe <axboe@xxxxxxxxx>
Cc: Ming Lei <ming.lei@xxxxxxxxxx>
Cc: Dave Chinner <david@xxxxxxxxxxxxx>
Cc: Jason Gunthorpe <jgg@xxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Cc: Steve French <sfrench@xxxxxxxxx>
Cc: linux-cifs@xxxxxxxxxxxxxxx
Cc: samba-technical@xxxxxxxxxxxxxxx
Cc: Yan Zheng <zyan@xxxxxxxxxx>
Cc: Sage Weil <sage@xxxxxxxxxx>
Cc: Ilya Dryomov <idryomov@xxxxxxxxx>
Cc: Alex Elder <elder@xxxxxxxxxx>
Cc: ceph-devel@xxxxxxxxxxxxxxx
Cc: Eric Van Hensbergen <ericvh@xxxxxxxxx>
Cc: Latchesar Ionkov <lucho@xxxxxxxxxx>
Cc: Mike Marshall <hubcap@xxxxxxxxxxxx>
Cc: Martin Brandenburg <martin@xxxxxxxxxxxx>
Cc: devel@xxxxxxxxxxxxxxxxxx
Cc: Dominique Martinet <asmadeus@xxxxxxxxxxxxx>
Cc: v9fs-developer@xxxxxxxxxxxxxxxxxxxxx
Cc: Coly Li <colyli@xxxxxxx>
Cc: Kent Overstreet <kent.overstreet@xxxxxxxxx>
Cc: linux-bcache@xxxxxxxxxxxxxxx
Cc: Ernesto A. Fernández <ernesto.mnd.fernandez@xxxxxxxxx>

Jérôme Glisse (15):
  fs/direct-io: fix trailing whitespace issues
  iov_iter: add helper to test if an iter would use GUP
  block: introduce bvec_page()/bvec_set_page() to get/set
    bio_vec.bv_page
  block: introduce BIO_VEC_INIT() macro to initialize bio_vec structure
  block: replace all bio_vec->bv_page by bvec_page()/bvec_set_page()
  block: convert bio_vec.bv_page to bv_pfn to store pfn and not page
  block: add bvec_put_page_dirty*() to replace put_page(bvec_page())
  block: use bvec_put_page() instead of put_page(bvec_page())
  block: bvec_put_page_dirty* instead of set_page_dirty* and
    bvec_put_page
  block: add gup flag to
    bio_add_page()/bio_add_pc_page()/__bio_add_page()
  block: make sure bio_add_page*() knows page that are coming from GUP
  fs/direct-io: keep track of wether a page is coming from GUP or not
  fs/splice: use put_user_page() when appropriate
  fs: use bvec_set_gup_page() where appropriate
  ceph: use put_user_pages() instead of ceph_put_page_vector()
Documentation/block/biodoc.txt | 7 +-
arch/m68k/emu/nfblock.c | 2 +-
arch/um/drivers/ubd_kern.c | 2 +-
arch/xtensa/platforms/iss/simdisk.c | 2 +-
block/bio-integrity.c | 8 +--
block/bio.c | 92 ++++++++++++++++-----------
block/blk-core.c | 2 +-
block/blk-integrity.c | 7 +-
block/blk-lib.c | 5 +-
block/blk-merge.c | 9 +--
block/blk.h | 4 +-
block/bounce.c | 26 ++++----
block/t10-pi.c | 4 +-
drivers/block/aoe/aoecmd.c | 4 +-
drivers/block/brd.c | 2 +-
drivers/block/drbd/drbd_actlog.c | 2 +-
drivers/block/drbd/drbd_bitmap.c | 4 +-
drivers/block/drbd/drbd_main.c | 4 +-
drivers/block/drbd/drbd_receiver.c | 6 +-
drivers/block/drbd/drbd_worker.c | 2 +-
drivers/block/floppy.c | 6 +-
drivers/block/loop.c | 16 ++---
drivers/block/null_blk_main.c | 6 +-
drivers/block/pktcdvd.c | 4 +-
drivers/block/ps3disk.c | 2 +-
drivers/block/ps3vram.c | 2 +-
drivers/block/rbd.c | 12 ++--
drivers/block/rsxx/dma.c | 3 +-
drivers/block/umem.c | 2 +-
drivers/block/virtio_blk.c | 4 +-
drivers/block/xen-blkback/blkback.c | 2 +-
drivers/block/zram/zram_drv.c | 24 +++----
drivers/lightnvm/core.c | 2 +-
drivers/lightnvm/pblk-core.c | 12 ++--
drivers/lightnvm/pblk-rb.c | 2 +-
drivers/lightnvm/pblk-read.c | 6 +-
drivers/md/bcache/btree.c | 2 +-
drivers/md/bcache/debug.c | 4 +-
drivers/md/bcache/request.c | 4 +-
drivers/md/bcache/super.c | 6 +-
drivers/md/bcache/util.c | 11 ++--
drivers/md/dm-bufio.c | 2 +-
drivers/md/dm-crypt.c | 18 ++++--
drivers/md/dm-integrity.c | 18 +++---
drivers/md/dm-io.c | 7 +-
drivers/md/dm-log-writes.c | 20 +++---
drivers/md/dm-verity-target.c | 4 +-
drivers/md/dm-writecache.c | 3 +-
drivers/md/dm-zoned-metadata.c | 6 +-
drivers/md/md.c | 4 +-
drivers/md/raid1-10.c | 2 +-
drivers/md/raid1.c | 4 +-
drivers/md/raid10.c | 4 +-
drivers/md/raid5-cache.c | 7 +-
drivers/md/raid5-ppl.c | 6 +-
drivers/md/raid5.c | 10 +--
drivers/nvdimm/blk.c | 6 +-
drivers/nvdimm/btt.c | 5 +-
drivers/nvdimm/pmem.c | 4 +-
drivers/nvme/host/core.c | 4 +-
drivers/nvme/host/tcp.c | 2 +-
drivers/nvme/target/io-cmd-bdev.c | 2 +-
drivers/nvme/target/io-cmd-file.c | 2 +-
drivers/s390/block/dasd_diag.c | 2 +-
drivers/s390/block/dasd_eckd.c | 14 ++--
drivers/s390/block/dasd_fba.c | 6 +-
drivers/s390/block/dcssblk.c | 2 +-
drivers/s390/block/scm_blk.c | 2 +-
drivers/s390/block/xpram.c | 2 +-
drivers/scsi/sd.c | 25 ++++----
drivers/staging/erofs/data.c | 6 +-
drivers/staging/erofs/unzip_vle.c | 4 +-
drivers/target/target_core_file.c | 6 +-
drivers/target/target_core_iblock.c | 4 +-
drivers/target/target_core_pscsi.c | 2 +-
drivers/xen/biomerge.c | 4 +-
fs/9p/vfs_addr.c | 4 +-
fs/afs/fsclient.c | 2 +-
fs/afs/rxrpc.c | 4 +-
fs/afs/yfsclient.c | 2 +-
fs/block_dev.c | 10 ++-
fs/btrfs/check-integrity.c | 6 +-
fs/btrfs/compression.c | 22 +++----
fs/btrfs/disk-io.c | 4 +-
fs/btrfs/extent_io.c | 16 ++---
fs/btrfs/file-item.c | 8 +--
fs/btrfs/inode.c | 20 +++---
fs/btrfs/raid56.c | 8 +--
fs/btrfs/scrub.c | 10 +--
fs/buffer.c | 4 +-
fs/ceph/file.c | 20 +++---
fs/cifs/connect.c | 4 +-
fs/cifs/misc.c | 14 ++--
fs/cifs/smb2ops.c | 2 +-
fs/cifs/smbdirect.c | 2 +-
fs/cifs/transport.c | 2 +-
fs/crypto/bio.c | 4 +-
fs/direct-io.c | 94 +++++++++++++++++++--------
fs/ext4/page-io.c | 4 +-
fs/ext4/readpage.c | 4 +-
fs/f2fs/data.c | 20 +++---
fs/gfs2/lops.c | 8 +--
fs/gfs2/meta_io.c | 4 +-
fs/gfs2/ops_fstype.c | 2 +-
fs/hfsplus/wrapper.c | 3 +-
fs/io_uring.c | 4 +-
fs/iomap.c | 10 +--
fs/jfs/jfs_logmgr.c | 4 +-
fs/jfs/jfs_metapage.c | 6 +-
fs/mpage.c | 6 +-
fs/nfs/blocklayout/blocklayout.c | 2 +-
fs/nilfs2/segbuf.c | 3 +-
fs/ocfs2/cluster/heartbeat.c | 2 +-
fs/orangefs/inode.c | 2 +-
fs/splice.c | 13 ++--
fs/xfs/xfs_aops.c | 8 +--
fs/xfs/xfs_buf.c | 2 +-
include/linux/bio.h | 13 ++--
include/linux/bvec.h | 99 +++++++++++++++++++++++++----
include/linux/uio.h | 11 ++++
kernel/power/swap.c | 2 +-
lib/iov_iter.c | 32 +++++-----
mm/page_io.c | 8 +--
net/ceph/messenger.c | 10 +--
net/sunrpc/xdr.c | 2 +-
net/sunrpc/xprtsock.c | 4 +-
126 files changed, 628 insertions(+), 467 deletions(-)

--
2.20.1