Re: [PATCH v3 4/4] io_uring: add support for zone-append

From: Kanchan Joshi
Date: Wed Jul 08 2020 - 09:40:09 EST


On Tue, Jul 07, 2020 at 04:37:55PM -0600, Jens Axboe wrote:
On 7/7/20 4:18 PM, Matthew Wilcox wrote:
On Tue, Jul 07, 2020 at 02:40:06PM -0600, Jens Axboe wrote:
so we have another 24 bytes before io_kiocb takes up another cacheline.
If that's a serious problem, I have an idea about how to shrink struct
kiocb by 8 bytes so struct io_rw would have space to store another
pointer.
Yes, io_kiocb has room. Cache-locality wise whether that is fine or
it must be placed within io_rw - I'll come to know once I get to
implement this. Please share the idea you have, it can come handy.

Except it doesn't, I'm not interested in adding per-request type fields
to the generic part of it. Before we know it, we'll blow past the next
cacheline.

If we can find space in the kiocb, that'd be much better. Note that once
the async buffered bits go in for 5.9, then there's no longer a 4-byte
hole in struct kiocb.

Well, poot, I was planning on using that. OK, how about this:

Figured you might have had your sights set on that one, which is why I
wanted to bring it up upfront :-)

+#define IOCB_NO_CMPL (15 << 28)

struct kiocb {
[...]
- void (*ki_complete)(struct kiocb *iocb, long ret, long ret2);
+ loff_t __user *ki_uposp;
- int ki_flags;
+ unsigned int ki_flags;

+typedef void ki_cmpl(struct kiocb *, long ret, long ret2);
+static ki_cmpl * const ki_cmpls[15];

+void ki_complete(struct kiocb *iocb, long ret, long ret2)
+{
+ unsigned int id = iocb->ki_flags >> 28;
+
+ if (id < 15)
+ ki_cmpls[id](iocb, ret, ret2);
+}

+int kiocb_cmpl_register(void (*cb)(struct kiocb *, long, long))
+{
+ for (i = 0; i < 15; i++) {
+ if (ki_cmpls[id])
+ continue;
+ ki_cmpls[id] = cb;
+ return id;
+ }
+ WARN();
+ return -1;
+}

That could work, we don't really have a lot of different completion
types in the kernel.

Thanks, this looks sorted.
The last thing is about the flag used to trigger this processing. Will it be fine to intoduce new flag (RWF_APPEND2 or RWF_APPEND_OFFSET)
instead of using RWF_APPEND?
New flag will do what RWF_APPEND does and will also return the written-location (and therefore expects pointer setup in application).