Re: [PATCH v3 1/1] nvmet-tcp: Don't kmap() pages which can't come from HIGHMEM

From: Al Viro
Date: Fri Aug 26 2022 - 15:33:30 EST


On Fri, Aug 26, 2022 at 08:16:59PM +0200, Fabio M. De Francesco wrote:

> As you may have already read, I'm so new to kernel development that I still
> know very little about many subsystems and drivers. I am not currently
> able to tell the difference between BVEC and KVEC. I could probably try to
> switch from one to the other (after learning from other code), however I won't
> be able to explain in the commit message why users should better use BVEC in
> this case.

struct kvec: pairs of form <kernel address, length>
struct bio_vec: triples of form <page, offset, length>

Either is a way to refer to a chunk of memory; the former obviously has it
already mapped (you don't get kernel addresses otherwise), the latter doesn't
need it mapped.
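
For illustration, a minimal userspace sketch of the two layouts. The field
names follow the kernel's struct kvec (include/linux/uio.h) and struct
bio_vec (include/linux/bvec.h); struct page is stubbed out so the fragment
builds outside the kernel, and kvec_of()/bvec_of() are hypothetical helpers
that exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct page, stubbed so this builds in userspace. */
struct page { unsigned char data[4096]; };

/* <kernel address, length>: the memory is already mapped. */
struct kvec {
	void *iov_base;
	size_t iov_len;
};

/* <page, offset, length>: nothing has to be mapped to build one. */
struct bio_vec {
	struct page *bv_page;
	unsigned int bv_len;
	unsigned int bv_offset;
};

/*
 * Hypothetical helper: describe a subrange of a page by address.
 * In the kernel the address would have to come from a mapping
 * (kmap or similar); here the stub page is directly addressable.
 */
static struct kvec kvec_of(struct page *p, unsigned int off, unsigned int len)
{
	struct kvec v = { .iov_base = p->data + off, .iov_len = len };
	return v;
}

/* Hypothetical helper: describe the same subrange by page. */
static struct bio_vec bvec_of(struct page *p, unsigned int off, unsigned int len)
{
	struct bio_vec v = { .bv_page = p, .bv_len = len, .bv_offset = off };
	return v;
}
```

Both helpers describe the same bytes; only the kvec one needs an address
to exist at the time the description is built.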

iov_iter instances might be backed by different things, including
arrays of kvec (iov_iter_kvec() constructs such) and arrays of
bio_vec (iov_iter_bvec() is the constructor for those).

iov_iter primitives (copy_to_iter/copy_from_iter/copy_page_to_iter/etc.)
work with either variant - they look at the flavour and act accordingly.

ITER_BVEC ones tend to do kmap_local_page() + copy + kunmap_local().
ITER_KVEC ones obviously use memcpy() for copying and that's it.
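
A toy userspace model of that dispatch, as a sketch only. Everything
prefixed toy_ or TOY_ is hypothetical and is not the lib/iov_iter.c
implementation; toy_map()/toy_unmap() merely stand in for
kmap_local_page()/kunmap_local(), and unlike the real primitives this
one does not advance the iterator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct page { unsigned char data[4096]; };	/* userspace stand-in */
struct kvec { void *iov_base; size_t iov_len; };
struct bio_vec { struct page *bv_page; unsigned int bv_len, bv_offset; };

enum toy_flavour { TOY_KVEC, TOY_BVEC };	/* hypothetical names */

struct toy_iter {
	enum toy_flavour flavour;
	union {
		const struct kvec *kvec;
		const struct bio_vec *bvec;
	};
	size_t nr_segs;
};

static int toy_maps;	/* number of mappings taken so far */

static void *toy_map(struct page *p) { toy_maps++; return p->data; }
static void toy_unmap(void *addr) { (void)addr; }

/* Copy up to len bytes from the iterator into dst; returns bytes copied. */
static size_t toy_copy_from_iter(void *dst, size_t len, const struct toy_iter *i)
{
	size_t done = 0;

	for (size_t n = 0; n < i->nr_segs && done < len; n++) {
		size_t chunk;

		if (i->flavour == TOY_KVEC) {
			/* already mapped: plain memcpy() and that's it */
			chunk = i->kvec[n].iov_len;
			if (chunk > len - done)
				chunk = len - done;
			memcpy((char *)dst + done, i->kvec[n].iov_base, chunk);
		} else {
			/* map on demand, copy, unmap */
			const struct bio_vec *bv = &i->bvec[n];
			char *addr = toy_map(bv->bv_page);

			chunk = bv->bv_len;
			if (chunk > len - done)
				chunk = len - done;
			memcpy((char *)dst + done, addr + bv->bv_offset, chunk);
			toy_unmap(addr);
		}
		done += chunk;
	}
	return done;
}
```

The caller sees the same interface for both flavours; the mapping, where
one is needed, stays an internal detail of the copy.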

If you need e.g. to send some subranges of some pages you could kmap them,
form a kvec array, make msg.msg_iter a KVEC-backed iterator over those and
feed it to sendmsg(). Or you could take a bio_vec array instead, make
msg.msg_iter a BVEC-backed iterator over that and feed it to sendmsg().

The difference is, in the latter case kmap_local_page() will be done on demand
*inside* ->sendmsg() instance, when it gets around to copying some data
from the source and calls something like csum_and_copy_from_iter() or
whichever primitive it chooses to use.

Why bother with mapping the damn thing in the caller and having it pinned
for the duration of whatever ->sendmsg() you end up calling? Just give it
page/offset/length instead of address/length and let lib/iov_iter.c
do the right thing...
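
To make the lifetime difference concrete, a userspace toy under stated
assumptions: toy_sendmsg_kvec()/toy_sendmsg_bvec() stand in for a
->sendmsg() instance, toy_map()/toy_unmap() for the kmap calls, and
send_premapped()/send_by_page() for the two caller styles. All of these
names are hypothetical; none is a kernel API. maps_at_entry records how
many mappings the caller is holding when the send routine starts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_SEGS 8

struct page { unsigned char data[4096]; };	/* userspace stand-in */
struct kvec { void *iov_base; size_t iov_len; };
struct bio_vec { struct page *bv_page; unsigned int bv_len, bv_offset; };

static int live_maps;		/* mappings currently held */
static int maps_at_entry;	/* mappings held when the send routine starts */

static void *toy_map(struct page *p) { live_maps++; return p->data; }
static void toy_unmap(void *addr) { (void)addr; live_maps--; }

/* Send routine fed address/length pairs: nothing to map, just copy. */
static size_t toy_sendmsg_kvec(char *out, const struct kvec *kv, size_t nr)
{
	size_t done = 0;

	maps_at_entry = live_maps;
	for (size_t n = 0; n < nr; n++) {
		memcpy(out + done, kv[n].iov_base, kv[n].iov_len);
		done += kv[n].iov_len;
	}
	return done;
}

/* Send routine fed page/offset/length triples: map only around each copy. */
static size_t toy_sendmsg_bvec(char *out, const struct bio_vec *bv, size_t nr)
{
	size_t done = 0;

	maps_at_entry = live_maps;
	for (size_t n = 0; n < nr; n++) {
		char *addr = toy_map(bv[n].bv_page);

		memcpy(out + done, addr + bv[n].bv_offset, bv[n].bv_len);
		toy_unmap(addr);
		done += bv[n].bv_len;
	}
	return done;
}

/* Style 1: caller maps every page and keeps it mapped across the call. */
static size_t send_premapped(char *out, const struct bio_vec *bv, size_t nr)
{
	struct kvec kv[TOY_MAX_SEGS];
	size_t n, done;

	if (nr > TOY_MAX_SEGS)
		return 0;
	for (n = 0; n < nr; n++) {
		kv[n].iov_base = (char *)toy_map(bv[n].bv_page) + bv[n].bv_offset;
		kv[n].iov_len = bv[n].bv_len;
	}
	done = toy_sendmsg_kvec(out, kv, nr);
	for (n = 0; n < nr; n++)
		toy_unmap(kv[n].iov_base);
	return done;
}

/* Style 2: caller passes the triples through untouched. */
static size_t send_by_page(char *out, const struct bio_vec *bv, size_t nr)
{
	return toy_sendmsg_bvec(out, bv, nr);
}
```

With two segments, send_premapped() enters the send routine holding two
mappings, while send_by_page() enters it holding none and never has more
than one live at a time.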