Re: [PATCH RFC] io_uring: Pass whole sqe to commands

From: Pavel Begunkov
Date: Fri Apr 14 2023 - 09:12:50 EST


On 4/14/23 03:12, Ming Lei wrote:
On Thu, Apr 13, 2023 at 09:47:56AM -0700, Breno Leitao wrote:
Hello Ming,

On Thu, Apr 13, 2023 at 10:56:49AM +0800, Ming Lei wrote:
On Thu, Apr 06, 2023 at 09:57:05AM -0700, Breno Leitao wrote:
Currently uring CMD operation relies on having large SQEs, but future
operations might want to use normal SQE.

The io_uring_cmd currently only saves the payload (cmd) part of the SQE,
but, for commands that use normal SQE size, it might be necessary to
access the initial SQE fields outside of the payload/cmd block. So,
save the whole SQE rather than just the pdu.

This slightly changes how the io_uring_cmd works, since the cmd
structures and callbacks are not opaque to io_uring anymore. I.e., the
callbacks can look at the SQE fields, not only at the cmd structure.

The main advantage is that we don't need to create custom structures for
simple commands.

Suggested-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>
---

...

diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 2e4c483075d3..9648134ccae1 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -63,14 +63,15 @@ EXPORT_SYMBOL_GPL(io_uring_cmd_done);
 int io_uring_cmd_prep_async(struct io_kiocb *req)
 {
 	struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
-	size_t cmd_size;
+	size_t size = sizeof(struct io_uring_sqe);
 	BUILD_BUG_ON(uring_cmd_pdu_size(0) != 16);
 	BUILD_BUG_ON(uring_cmd_pdu_size(1) != 80);
-	cmd_size = uring_cmd_pdu_size(req->ctx->flags & IORING_SETUP_SQE128);
+	if (req->ctx->flags & IORING_SETUP_SQE128)
+		size <<= 1;
-	memcpy(req->async_data, ioucmd->cmd, cmd_size);
+	memcpy(req->async_data, ioucmd->sqe, size);

The copy will make some fields of the sqe be READ TWICE, and the driver may
see a different sqe field value from the one observed in io_init_req().

This copy only happens if the operation goes to the async path
(calling io_uring_cmd_prep_async()). This only happens if
f_op->uring_cmd() returns -EAGAIN.

	ret = file->f_op->uring_cmd(ioucmd, issue_flags);
	if (ret == -EAGAIN) {
		if (!req_has_async_data(req)) {
			if (io_alloc_async_data(req))
				return -ENOMEM;
			io_uring_cmd_prep_async(req);
		}
		return -EAGAIN;
	}

Are you saying that after this copy, the operation is still reading from
sqe instead of req->async_data?
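
For illustration only, the -EAGAIN path quoted above can be modelled in
userspace roughly as below. This is a sketch, not kernel code: struct
mock_req, mock_issue() and SQE_SIZE are hypothetical names, the handler's
return value is passed in by the caller, and the only facts taken from the
patch are that the copy is made once, on -EAGAIN, and that its size doubles
for SQE128 rings.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define SQE_SIZE 64	/* size of a normal SQE; doubled for SQE128 */

/* Hypothetical stand-in for the request; not struct io_kiocb. */
struct mock_req {
	const unsigned char *sqe;	/* points into the shared ring */
	void *async_data;		/* private copy, allocated lazily */
	int big_sqe;			/* models IORING_SETUP_SQE128 */
	int copies;			/* how many times the SQE was copied */
};

/* handler_ret models what f_op->uring_cmd() returned for this attempt. */
static int mock_issue(struct mock_req *req, int handler_ret)
{
	assert(req && req->sqe);

	if (handler_ret == -EAGAIN) {
		if (!req->async_data) {
			size_t size = SQE_SIZE;

			if (req->big_sqe)
				size <<= 1;
			req->async_data = malloc(size);
			if (!req->async_data)
				return -ENOMEM;
			/* the whole SQE is snapshotted, only on this path */
			memcpy(req->async_data, req->sqe, size);
			req->copies++;
		}
		return -EAGAIN;
	}
	return handler_ret;
}
```

In this model a request that completes inline never touches async_data,
and a request that gets -EAGAIN more than once is still copied only once.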

I meant that the 2nd read is on the sqe copy (req->async_data), but the same
fields can differ between the two READs (the first is done on the original
SQE during io_init_req(), and the second is done on the sqe copy in the driver).

Will this kind of inconsistency cause trouble for drivers? READ TWICE
becomes possible with this patch.
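
To make the READ TWICE concern concrete, here is a minimal userspace
sketch. Everything in it is hypothetical (struct fake_sqe, check_len(),
snapshot() are not io_uring names); it only shows that a value checked on
the shared entry and a value copied later can disagree if the entry is
rewritten in between.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in; not the real struct io_uring_sqe. */
struct fake_sqe {
	uint8_t  opcode;
	uint32_t len;
};

/* First read: a check done directly on the shared, user-writable entry. */
static int check_len(const volatile struct fake_sqe *shared, uint32_t max)
{
	return shared->len <= max;
}

/* Second read: snapshot the shared entry into private storage. */
static void snapshot(struct fake_sqe *priv,
		     const volatile struct fake_sqe *shared)
{
	priv->opcode = shared->opcode;
	priv->len = shared->len;
}

/* Returns 1 if the copied value violates the bound that was checked. */
static int double_fetch_demo(void)
{
	volatile struct fake_sqe shared = { .opcode = 1, .len = 16 };
	struct fake_sqe priv;
	int ok;

	ok = check_len(&shared, 64);	/* passes: len is 16 */
	assert(ok);

	shared.len = 4096;		/* models userspace rewriting the entry */

	snapshot(&priv, &shared);	/* the copy now holds 4096 */
	return priv.len > 64;
}
```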

Right, it might happen, and I was keeping that in mind, but it's not
specific to this patch. It won't reload core io_uring bits, and all
fields that cmds use already have this problem.

Unless there is a better option, the direction we'll be moving in is
adding a preparation step that should read and stash the parts of the SQE
it cares about, which should also make the full SQE copy
unnecessary / optional.
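
A rough sketch of what such a preparation step could look like, written as
plain userspace C. None of these names exist in io_uring: struct fake_sqe,
struct cmd_state, cmd_prep() and cmd_issue() are made up for illustration,
and READ_ONCE_U32() is a local macro imitating a single volatile load. The
point is only that each field is read once, validated, and stashed, and
that later stages use the stash alone.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified SQE layout. */
struct fake_sqe {
	uint8_t  opcode;
	uint32_t cmd_op;
	uint32_t len;
};

/* Hypothetical per-command private state filled in by the prep step. */
struct cmd_state {
	uint32_t cmd_op;
	uint32_t len;
};

/* Local helper: force exactly one load of a 32-bit field. */
#define READ_ONCE_U32(x) (*(const volatile uint32_t *)&(x))

/* Read each field once, validate the local value, then stash it. */
static int cmd_prep(struct cmd_state *st, const struct fake_sqe *sqe,
		    uint32_t max_len)
{
	uint32_t len;

	assert(st && sqe);

	len = READ_ONCE_U32(sqe->len);
	if (len > max_len)
		return -1;

	st->cmd_op = READ_ONCE_U32(sqe->cmd_op);
	st->len = len;		/* the validated value, not a re-read */
	return 0;
}

/* The issue path only looks at the stash, never at the SQE again. */
static uint32_t cmd_issue(const struct cmd_state *st)
{
	return st->len;
}
```

With this shape, a later change to the SQE cannot alter what the command
acts on, so no full SQE copy is needed for the fields that were stashed.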

If you have an example of the two-copies flow, that would be great.

No example yet, but I also don't see any access to cmd->sqe (except for cmd_op)
in your patches.

--
Pavel Begunkov