Re: [PATCH 002/437] fs: add generic read/write iterator helpers
From: Al Viro
Date: Mon Apr 15 2024 - 19:43:04 EST
On Mon, Apr 15, 2024 at 03:16:12PM -0600, Jens Axboe wrote:
> The old read/write path only handled user pointers, with the exception
> being bvecs mapped on the io_uring side (which the io_uring version
> dealt with) which also originally came from user pointers. So just user
> memory. Why would we need to expand that now? Who would be doing
> ->read_iter() or ->write_iter() with something that isn't either UBUF or
> IOVEC? Because that would break horrible as it is in the current kernel.
No, it will not. And yes, it does happen. Look, for example, at
fs/coredump.c:dump_emit_page(). ->write_iter() (regular file or pipe one)
is called. On ITER_BVEC.
It happens, and not only there. Look at how /dev/loop works, for crying
out loud! You get a struct request; the backing pages might very well have
_never_ been mapped to any userland address (e.g. when it's something like
a directory block). And you hit this:
static int lo_write_simple(struct loop_device *lo, struct request *rq,
loff_t pos)
{
struct bio_vec bvec;
struct req_iterator iter;
int ret = 0;
rq_for_each_segment(bvec, rq, iter) {
ret = lo_write_bvec(lo->lo_backing_file, &bvec, &pos);
if (ret < 0)
break;
cond_resched();
}
return ret;
}
with lo_write_bvec() being
static int lo_write_bvec(struct file *file, struct bio_vec *bvec, loff_t *ppos)
{
struct iov_iter i;
ssize_t bw;
iov_iter_bvec(&i, ITER_SOURCE, bvec, 1, bvec->bv_len);
bw = vfs_iter_write(file, &i, ppos, 0);
if (likely(bw == bvec->bv_len))
return 0;
printk_ratelimited(KERN_ERR
"loop: Write error at byte offset %llu, length %i.\n",
(unsigned long long)*ppos, bvec->bv_len);
if (bw >= 0)
bw = -EIO;
return bw;
}
Neither ->read_iter() nor ->write_iter() can make an assumption that it
will be used with userland buffer. And yes, it works...