On Fri, Mar 21, 2025 at 12:48:16PM -0600, Caleb Sander Mateos wrote:
To use ublk zero copy, an application submits a sequence of io_uring
operations:
(1) Register a ublk request's buffer into the fixed buffer table
(2) Use the fixed buffer in some I/O operation
(3) Unregister the buffer from the fixed buffer table
The ordering of these operations is critical; if the fixed buffer lookup
occurs before the register or after the unregister operation, the I/O
will fail with EFAULT or even corrupt a different ublk request's buffer.
It is possible to guarantee the correct order by linking the operations,
but that adds overhead and doesn't allow multiple I/O operations to
execute in parallel using the same ublk request's buffer. Ideally, the
application could just submit the register, I/O, and unregister SQEs in
the desired order without links and io_uring would ensure the ordering.
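The ordering hazard described above can be illustrated with a small user-space toy model. This is only a sketch: the table, the slot numbering and the helper names (register_buf, lookup_buf, unregister_buf) are hypothetical stand-ins, not the kernel's fixed buffer table code or any io_uring API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TABLE_SLOTS 4

/* Toy stand-in for the fixed buffer table; NULL means "slot is empty". */
static const char *table[TABLE_SLOTS];

/* Step (1): install a request's buffer into a slot. */
static int register_buf(unsigned slot, const char *buf)
{
	if (slot >= TABLE_SLOTS || buf == NULL)
		return -EINVAL;
	if (table[slot] != NULL)
		return -EBUSY;
	table[slot] = buf;
	return 0;
}

/* Step (2): what an I/O operation does when it resolves its buffer index. */
static int lookup_buf(unsigned slot, const char **out)
{
	assert(out != NULL);
	if (slot >= TABLE_SLOTS || table[slot] == NULL)
		return -EFAULT;	/* lookup ran before register or after unregister */
	*out = table[slot];
	return 0;
}

/* Step (3): remove the buffer so the slot can be reused. */
static int unregister_buf(unsigned slot)
{
	if (slot >= TABLE_SLOTS || table[slot] == NULL)
		return -EINVAL;
	table[slot] = NULL;
	return 0;
}
```

In this model a lookup on an empty slot returns -EFAULT. If the slot has meanwhile been reused for another request, a late lookup succeeds but returns the other request's buffer, which corresponds to the corruption case.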
So far there are only two ways to guarantee the order from the io_uring
syscall viewpoint:
1) IOSQE_IO_LINK
2) submit the register_buffer operation and wait for its completion, then
submit the IO operations
Otherwise, you are just depending on the implementation: there is no such
order guarantee, and it is hard to write a generic io_uring application.
I posted the sqe group patchset to address this particular requirement at
the API level.
https://lore.kernel.org/linux-block/20241107110149.890530-1-ming.lei@xxxxxxxxxx/
Now I'd suggest reconsidering this approach so that the order is respected
at the API level, and neither the application nor io_uring needs to play
tricks to address this real problem.
With sqe group, just two kinds of OPs are needed:
- provide_buffer OP (group leader)
- other generic OPs (group members)
The group leader won't be completed until all group member OPs are done.
The whole group shares the same IO_LINK/IO_HARDLINK flag.
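The completion rule for the group leader can be sketched as a user-space toy model. This is not code from the patchset: struct sqe_group and every helper name below are hypothetical, and the model covers only the rule that the leader's completion is held back until all members are done.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one sqe group; all names are hypothetical. */
struct sqe_group {
	unsigned pending_members;	/* member OPs not yet completed */
	bool leader_done;		/* leader's own work has finished */
	bool leader_cqe_posted;		/* leader completion visible to the app */
};

static void group_init(struct sqe_group *g, unsigned nr_members)
{
	g->pending_members = nr_members;
	g->leader_done = false;
	g->leader_cqe_posted = false;
}

/* Post the leader completion only when nothing in the group is outstanding. */
static void group_try_complete_leader(struct sqe_group *g)
{
	if (g->leader_done && g->pending_members == 0)
		g->leader_cqe_posted = true;
}

/* The leader (e.g. the provide_buffer OP) has finished its own work. */
static void group_leader_done(struct sqe_group *g)
{
	g->leader_done = true;
	group_try_complete_leader(g);
}

/* One member OP has finished; the last one releases the leader. */
static void group_member_done(struct sqe_group *g)
{
	assert(g->pending_members > 0);
	g->pending_members--;
	group_try_complete_leader(g);
}
```

With two members, the leader completion in this model becomes visible only after the second group_member_done() call, so the buffer provided by the leader stays valid for every member regardless of the order in which members finish.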
That is the whole concept. This approach takes fewer SQEs, and the
application becomes simpler too.