Re: [PATCH v3 0/6] block: fix integrity offset/length conversions

From: Martin K. Petersen

Date: Mon Apr 20 2026 - 22:09:35 EST



Hi Caleb!

> NVM Command Set specification 1.1 section 5.3.3 requires the reference
> tag to increment by 1 per logical block, so that seems to determine
> the increment unit:

SCSI allows PI to be interleaved at intervals smaller than the logical
block size. This was done for PI compatibility in mixed environments
with both 512e/512n and 4Kn disks. Interleaving allows 8 bytes of PI per
512 bytes of data on devices using 4 KB logical blocks. That is the
reason why we use the term "integrity interval" instead of assuming
logical block size.

> The ref tag used for a particular block needs to be consistent. And
> since reftag(block N) can be computed as the reftag(M) + N - M if
> block N is accessed as part of an I/O that begins at block M, the
> function must be of the form reftag(block N) = N + c for some constant
> c. Thus, the ref tag seed needs to be computed in units of logical
> blocks (integrity intervals); no other unit (e.g. 512-byte sectors)
> works.

Whoever attaches the PI decides on the seed value. In the case of the
block layer, it made sense to use the block layer sector number since
that value is inevitably going to be the same for a future read.

Note that with MD, DM, and partitioning in the mix, the sector number
seen by whoever submits the I/O is going to be different from the LBAs
on the target devices which eventually receive the I/O. Nobody says
there is a computable constant offset. Think scattered LVM extent
allocations. Or RAID stripes placed at mismatched LBA offsets.

> To see the issue with the current approach, consider an example
> accessing LBA 1 on a device with a 4 KB block size. If the block is
> written as part of a write that begins at LBA 0, its ref tag in the
> generated PI will be 1 (sector 0 + 1 integrity interval). If it's
> later read by a read starting at LBA 1, its expected ref tag will be 8
> (sector 8 + 0 integrity intervals), and the auto-integrity code will
> fail the read due to a reftag mismatch.

Something is broken, then. Because the ref tag in the received PI should
have been remapped to start at 8 in that case.

> I agree, the seed doesn't need to match the final LBA, but it does
> need to be in *units* of logical blocks, plus some constant offset.

Your concept of "unit" still sends the wrong message. The seed is an
integer value used to initialize a counter or hardware register. The
seed only has meaning to whichever entity submits the I/O. To everything
else it is a value used for remapping ref tags from the I/O submitter's
point of view to whichever interpretation is mandated by the storage
hardware's PI format.

> With a ublk device. It should affect any block device that supports
> integrity and has a logical block size > 512.

It sounds like the seed value is set incorrectly for reads in your
configuration.

--
Martin K. Petersen