Re: [PATCH 16/17] block: use iomap for writes to block devices

From: Hannes Reinecke
Date: Fri May 19 2023 - 10:22:11 EST


On 4/24/23 07:49, Christoph Hellwig wrote:
Use iomap in buffer_head compat mode to write to block devices.

Signed-off-by: Christoph Hellwig <hch@xxxxxx>
---
block/Kconfig | 1 +
block/fops.c | 33 +++++++++++++++++++++++++++++----
2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/block/Kconfig b/block/Kconfig
index 941b2dca70db73..672b08f0096ab4 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -5,6 +5,7 @@
menuconfig BLOCK
bool "Enable the block layer" if EXPERT
default y
+ select IOMAP
select SBITMAP
help
Provide block layer support for the kernel.
diff --git a/block/fops.c b/block/fops.c
index 318247832a7bcf..7910636f8df33b 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -15,6 +15,7 @@
#include <linux/falloc.h>
#include <linux/suspend.h>
#include <linux/fs.h>
+#include <linux/iomap.h>
#include <linux/module.h>
#include "blk.h"
@@ -386,6 +387,27 @@ static ssize_t blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
return __blkdev_direct_IO(iocb, iter, bio_max_segs(nr_pages));
}
+static int blkdev_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
+ unsigned int flags, struct iomap *iomap, struct iomap *srcmap)
+{
+ struct block_device *bdev = I_BDEV(inode);
+ loff_t isize = i_size_read(inode);
+
+ iomap->bdev = bdev;
+ iomap->offset = ALIGN_DOWN(offset, bdev_logical_block_size(bdev));
+ if (WARN_ON_ONCE(iomap->offset >= isize))
+ return -EIO;

I'm hitting this WARN_ON_ONCE during boot:
[ 5.016324] <TASK>
[ 5.030256] iomap_iter+0x11a/0x350
[ 5.030264] iomap_readahead+0x1eb/0x2c0
[ 5.030272] read_pages+0x5d/0x220
[ 5.030279] page_cache_ra_unbounded+0x131/0x180
[ 5.030284] filemap_get_pages+0xff/0x5a0
[ 5.030292] filemap_read+0xca/0x320
[ 5.030296] ? aa_file_perm+0x126/0x500
[ 5.040216] ? touch_atime+0xc8/0x150
[ 5.040224] blkdev_read_iter+0xb0/0x150
[ 5.040228] vfs_read+0x226/0x2d0
[ 5.040234] ksys_read+0xa5/0xe0
[ 5.040238] do_syscall_64+0x5b/0x80

Maybe we should consider this patch:

diff --git a/block/fops.c b/block/fops.c
index 524b8a828aad..d202fb663f25 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -386,10 +386,13 @@ static int blkdev_iomap_begin(struct inode *inode, loff_t offset, loff_t length,

iomap->bdev = bdev;
iomap->offset = ALIGN_DOWN(offset, bdev_logical_block_size(bdev));
- if (WARN_ON_ONCE(iomap->offset >= isize))
- return -EIO;
- iomap->type = IOMAP_MAPPED;
- iomap->addr = iomap->offset;
+ if (WARN_ON_ONCE(iomap->offset >= isize)) {
+ iomap->type = IOMAP_HOLE;
+ iomap->addr = IOMAP_NULL_ADDR;
+ } else {
+ iomap->type = IOMAP_MAPPED;
+ iomap->addr = iomap->offset;
+ }
iomap->length = isize - iomap->offset;
if (IS_ENABLED(CONFIG_BUFFER_HEAD))
iomap->flags |= IOMAP_F_BUFFER_HEAD;
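
For readers following the thread, here is a minimal user-space sketch of the mapping decision in the proposed change. It is a model only, not kernel code: `model_iomap_begin`, `struct map_result`, `enum map_type` and `align_down` are hypothetical names invented for illustration, and the model uses signed 64-bit integers throughout. It shows that a read starting at or past the device size is reported as a hole instead of failing, and that the length, which is still computed as `isize - offset` outside the if/else, comes out as zero or negative in that case in this model.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for IOMAP_HOLE / IOMAP_MAPPED. */
enum map_type { MAP_HOLE, MAP_MAPPED };

/* Hypothetical, reduced stand-in for the fields of struct iomap used here. */
struct map_result {
	enum map_type type;
	int64_t offset;
	int64_t length;
};

/* Round x down to a multiple of a; a must be a power of two. */
static int64_t align_down(int64_t x, int64_t a)
{
	assert(a > 0 && (a & (a - 1)) == 0);
	return x & ~(a - 1);
}

/*
 * Model of the patched logic: align the offset to the logical block
 * size, report a hole at or beyond the device size, a mapped extent
 * otherwise. The length is computed the same way in both cases, as in
 * the diff above.
 */
static struct map_result model_iomap_begin(int64_t offset, int64_t isize,
					   int64_t logical_block_size)
{
	struct map_result r;

	r.offset = align_down(offset, logical_block_size);
	r.type = (r.offset >= isize) ? MAP_HOLE : MAP_MAPPED;
	r.length = isize - r.offset;
	return r;
}
```

With an 8192-byte device and 512-byte logical blocks, an offset of 4196 maps to the extent starting at 4096 with 4096 bytes remaining, while an offset of 9000 yields a hole at 8704 with a length of -512 in this model.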


Other than that, the system seems fine.

Cheers,

Hannes