Re: [PATCH] io_uring/rsrc: Simplify buffer cloning by locking both rings

From: Jens Axboe
Date: Wed Jan 15 2025 - 15:22:36 EST


On 1/15/25 1:20 PM, Jann Horn wrote:
> On Wed, Jan 15, 2025 at 6:18?PM Jens Axboe <axboe@xxxxxxxxx> wrote:
>> On 1/15/25 9:25 AM, Jann Horn wrote:
>>> The locking in the buffer cloning code is somewhat complex because it goes
>>> back and forth between locking the source ring and the destination ring.
>>>
>>> Make it easier to reason about by locking both rings at the same time.
>>> To avoid ABBA deadlocks, lock the rings in ascending kernel address order,
>>> just like in lock_two_nondirectories().
>>>
>>> Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
>>> ---
>>> Just an idea for how I think io_clone_buffers() could be changed so it
>>> becomes slightly easier to reason about.
>>> I left the out_unlock jump label with its current name for now, though
>>> I guess that should probably be adjusted.
>>
>> Looks pretty clean to me, and does make it easier to reason about. Only
>> thing that stuck out to me was:
>>
>>> @@ -1067,7 +1060,18 @@ int io_register_clone_buffers(struct io_ring_ctx *ctx, void __user *arg)
>>> file = io_uring_register_get_file(buf.src_fd, registered_src);
>>> if (IS_ERR(file))
>>> return PTR_ERR(file);
>>> - ret = io_clone_buffers(ctx, file->private_data, &buf);
>>> + src_ctx = file->private_data;
>>> + if (src_ctx == ctx) {
>>> + ret = -ELOOP;
>>> + goto out_put;
>>> + }
>>
>> which is a change, as previously it would've been legal to do something ala:
>>
>> struct io_uring ring;
>> struct iovec vecs[2];
>>
>> vecs[0] = real_buffer;
>> vecs[1] = sparse_buffer;
>>
>> io_uring_register_buffers(&ring, vecs, 2);
>>
>> io_uring_clone_buffers_offset(&ring, &ring, 1, 0, 1, IORING_REGISTER_DST_REPLACE);
>>
>> and clone vecs[0] into slot 1. With the patch, that'll return -ELOOP instead.
>>
>> Maybe something like the below incremental, to just make the unlock +
>> double lock depending on whether they are different or not? And also
>> cleaning up the label naming at the same time.
>
> Yeah, looks good to me. If nobody else has review feedback, do you
> want to fold that in locally? If there's more feedback, I'll fold that
> incremental into my v2.

If you want to send off a v2, just fold it in. That would be the most
appropriate imho, rather than me modifying your patch :)

--
Jens Axboe