Re: [PATCH v3 4/4] io_uring: add support for zone-append

From: Jens Axboe
Date: Wed Jul 08 2020 - 10:54:13 EST


On 7/8/20 6:58 AM, Kanchan Joshi wrote:
> On Tue, Jul 07, 2020 at 04:37:55PM -0600, Jens Axboe wrote:
>> On 7/7/20 4:18 PM, Matthew Wilcox wrote:
>>> On Tue, Jul 07, 2020 at 02:40:06PM -0600, Jens Axboe wrote:
>>>>>> so we have another 24 bytes before io_kiocb takes up another cacheline.
>>>>>> If that's a serious problem, I have an idea about how to shrink struct
>>>>>> kiocb by 8 bytes so struct io_rw would have space to store another
>>>>>> pointer.
>>>>> Yes, io_kiocb has room. Cache-locality wise whether that is fine or
>>>>> it must be placed within io_rw - I'll come to know once I get to
>>>>> implement this. Please share the idea you have, it can come handy.
>>>>
>>>> Except it doesn't, I'm not interested in adding per-request type fields
>>>> to the generic part of it. Before we know it, we'll blow past the next
>>>> cacheline.
>>>>
>>>> If we can find space in the kiocb, that'd be much better. Note that once
>>>> the async buffered bits go in for 5.9, then there's no longer a 4-byte
>>>> hole in struct kiocb.
>>>
>>> Well, poot, I was planning on using that. OK, how about this:
>>
>> Figured you might have had your sights set on that one, which is why I
>> wanted to bring it up upfront :-)
>>
>>> +#define IOCB_NO_CMPL (15 << 28)
>>>
>>> struct kiocb {
>>> [...]
>>> - void (*ki_complete)(struct kiocb *iocb, long ret, long ret2);
>>> + loff_t __user *ki_uposp;
>>> - int ki_flags;
>>> + unsigned int ki_flags;
>>>
>>> +typedef void ki_cmpl(struct kiocb *, long ret, long ret2);
>>> +static ki_cmpl * const ki_cmpls[15];
>>>
>>> +void ki_complete(struct kiocb *iocb, long ret, long ret2)
>>> +{
>>> + unsigned int id = iocb->ki_flags >> 28;
>>> +
>>> + if (id < 15)
>>> + ki_cmpls[id](iocb, ret, ret2);
>>> +}
>>>
>>> +int kiocb_cmpl_register(void (*cb)(struct kiocb *, long, long))
>>> +{
>>> + for (i = 0; i < 15; i++) {
>>> + if (ki_cmpls[id])
>>> + continue;
>>> + ki_cmpls[id] = cb;
>>> + return id;
>>> + }
>>> + WARN();
>>> + return -1;
>>> +}
>>
>> That could work, we don't really have a lot of different completion
>> types in the kernel.
>
> Thanks, this looks sorted.

Not really, someone still needs to do that work. I took a quick look, and
most of it looks straight forward. The only potential complication is
ocfs2, which does a swap of the completion for the kiocb. That would just
turn into an upper flag swap. And potential sync kiocb with NULL
ki_complete. The latter should be fine, I think we just need to reserve
completion nr 0 for being that.

--
Jens Axboe