Re: [PATCH v20 12/12] null_blk: add support for copy offload

From: Nitesh Shetty
Date: Wed May 22 2024 - 01:02:22 EST


On 20/05/24 04:42PM, Bart Van Assche wrote:
On 5/20/24 03:20, Nitesh Shetty wrote:
+	if (blk_rq_nr_phys_segments(req) != BLK_COPY_MAX_SEGMENTS)
+		return status;

Why is this check necessary?

+	/*
+	 * First bio contains information about destination and last bio
+	 * contains information about source.
+	 */

Please check this at runtime (WARN_ON_ONCE()?).

+	__rq_for_each_bio(bio, req) {
+		if (seg == blk_rq_nr_phys_segments(req)) {
+			sector_in = bio->bi_iter.bi_sector;
+			if (rem != bio->bi_iter.bi_size)
+				return status;
+		} else {
+			sector_out = bio->bi_iter.bi_sector;
+			rem = bio->bi_iter.bi_size;
+		}
+		seg++;
+	}

__rq_for_each_bio() iterates over the bios in a request. Does a copy
offload request always have two bios - one copy destination bio and
one copy source bio? If so, is 'seg' a bio counter? Why is that bio
counter compared with the number of physical segments in the request?

Yes, your observation is right. We treat the first bio as the destination
and the second bio as the source. Without that comparison, we would need to
store the bio index in a temporary variable and parse based on the index
value.
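
For illustration, here is a rough, untested sketch (reusing the variable
names from the quoted hunk) of how the parsing could look with an explicit
bio index plus the WARN_ON_ONCE() runtime check suggested above:

	if (WARN_ON_ONCE(blk_rq_nr_phys_segments(req) != BLK_COPY_MAX_SEGMENTS))
		return status;

	seg = 0;
	__rq_for_each_bio(bio, req) {
		if (seg == 0) {
			/* first bio: copy destination */
			sector_out = bio->bi_iter.bi_sector;
			rem = bio->bi_iter.bi_size;
		} else {
			/* last bio: copy source */
			sector_in = bio->bi_iter.bi_sector;
			if (rem != bio->bi_iter.bi_size)
				return status;
		}
		seg++;
	}

That would keep the dst/src roles explicit without comparing the counter
against blk_rq_nr_phys_segments().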

+	trace_nullb_copy_op(req, sector_out << SECTOR_SHIFT,
+			    sector_in << SECTOR_SHIFT, rem);
+
+	spin_lock_irq(&nullb->lock);
+	while (rem > 0) {
+		chunk = min_t(size_t, nullb->dev->blocksize, rem);
+		offset_in = (sector_in & SECTOR_MASK) << SECTOR_SHIFT;
+		offset_out = (sector_out & SECTOR_MASK) << SECTOR_SHIFT;
+
+		if (null_cache_active(nullb) && !is_fua)
+			null_make_cache_space(nullb, PAGE_SIZE);
+
+		t_page_in = null_lookup_page(nullb, sector_in, false,
+					     !null_cache_active(nullb));
+		if (!t_page_in)
+			goto err;
+		t_page_out = null_insert_page(nullb, sector_out,
+					      !null_cache_active(nullb) ||
+					      is_fua);
+		if (!t_page_out)
+			goto err;
+
+		in = kmap_local_page(t_page_in->page);
+		out = kmap_local_page(t_page_out->page);
+
+		memcpy(out + offset_out, in + offset_in, chunk);
+		kunmap_local(out);
+		kunmap_local(in);
+		__set_bit(sector_out & SECTOR_MASK, t_page_out->bitmap);
+
+		if (is_fua)
+			null_free_sector(nullb, sector_out, true);
+
+		rem -= chunk;
+		sector_in += chunk >> SECTOR_SHIFT;
+		sector_out += chunk >> SECTOR_SHIFT;
+	}
+
+	status = 0;
+err:
+	spin_unlock_irq(&nullb->lock);

In the worst case, how long does this loop disable interrupts?

We haven't measured this, but it should be comparable to read and write in
the present infrastructure, since we followed a similar approach.
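
If a rough worst-case number would help, one untested way to get one would
be to time the critical section, for example something like (simplified,
error path omitted):

	u64 t0, delta_ns;

	spin_lock_irq(&nullb->lock);
	t0 = ktime_get_ns();
	while (rem > 0) {
		/* ... existing copy loop body ... */
	}
	delta_ns = ktime_get_ns() - t0;
	spin_unlock_irq(&nullb->lock);
	pr_debug("%s: copy held lock for %llu ns\n", __func__, delta_ns);

We can collect such numbers for the worst case (largest supported copy) if
that is useful.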

+TRACE_EVENT(nullb_copy_op,
+	TP_PROTO(struct request *req,
+		 sector_t dst, sector_t src, size_t len),
+	TP_ARGS(req, dst, src, len),
+	TP_STRUCT__entry(
+		__array(char, disk, DISK_NAME_LEN)
+		__field(enum req_op, op)
+		__field(sector_t, dst)
+		__field(sector_t, src)
+		__field(size_t, len)
+	),

Isn't __string() preferred over __array() since the former occupies less space
in the trace buffer?

Again, we followed the existing implementation here, to keep the series
simpler to review.
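
In case it helps a later revision, here is a rough, untested sketch of what
the event could look like with __string()/__assign_str() instead of the
fixed array (field names as in the quoted patch):

TRACE_EVENT(nullb_copy_op,
	TP_PROTO(struct request *req,
		 sector_t dst, sector_t src, size_t len),
	TP_ARGS(req, dst, src, len),
	TP_STRUCT__entry(
		__string(disk, req->q->disk->disk_name)
		__field(enum req_op, op)
		__field(sector_t, dst)
		__field(sector_t, src)
		__field(size_t, len)
	),
	TP_fast_assign(
		__assign_str(disk, req->q->disk->disk_name);
		__entry->op = req_op(req);
		__entry->dst = dst;
		__entry->src = src;
		__entry->len = len;
	),
	TP_printk("%s op=%s dst=%llu src=%llu len=%zu",
		  __get_str(disk), blk_op_str(__entry->op),
		  (unsigned long long)__entry->dst,
		  (unsigned long long)__entry->src,
		  __entry->len)
);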

Thank you,
Nitesh Shetty