Re: [PATCH RFC v4 1/3] block: add BIO_COMPLETE_IN_TASK for task-context completion

From: Jens Axboe

Date: Wed Apr 08 2026 - 15:54:41 EST


On 4/8/26 12:48 PM, Tal Zussman wrote:
> On 3/25/26 4:14 PM, Jens Axboe wrote:
>> On 3/25/26 12:43 PM, Tal Zussman wrote:
>>> +static void bio_complete_work_fn(struct work_struct *w)
>>> +{
>>> +	struct bio_complete_batch *batch;
>>> +	struct bio_list list;
>>> +
>>> +again:
>>> +	local_lock_irq(&bio_complete_batch.lock);
>>> +	batch = this_cpu_ptr(&bio_complete_batch);
>>> +	list = batch->list;
>>> +	bio_list_init(&batch->list);
>>> +	local_unlock_irq(&bio_complete_batch.lock);
>>> +
>>> +	while (!bio_list_empty(&list)) {
>>> +		struct bio *bio = bio_list_pop(&list);
>>> +		bio->bi_end_io(bio);
>>> +	}
>>> +
>>> +	local_lock_irq(&bio_complete_batch.lock);
>>> +	batch = this_cpu_ptr(&bio_complete_batch);
>>> +	if (!bio_list_empty(&batch->list)) {
>>> +		local_unlock_irq(&bio_complete_batch.lock);
>>> +
>>> +		if (!need_resched())
>>> +			goto again;
>>> +
>>> +		schedule_work_on(smp_processor_id(), &batch->work);
>>> +		return;
>>> +	}
>>> +	local_unlock_irq(&bio_complete_batch.lock);
>>> +}
>>
>> bool looped = false;
>>
>> do {
>> 	if (looped && need_resched()) {
>> 		schedule_work_on(smp_processor_id(), &batch->work);
>> 		break;
>> 	}
>>
>> 	local_lock_irq(&bio_complete_batch.lock);
>> 	batch = this_cpu_ptr(&bio_complete_batch);
>> 	list = batch->list;
>> 	bio_list_init(&batch->list);
>> 	local_unlock_irq(&bio_complete_batch.lock);
>>
>> 	if (bio_list_empty(&list))
>> 		break;
>>
>> 	do {
>> 		struct bio *bio = bio_list_pop(&list);
>> 		bio->bi_end_io(bio);
>> 	} while (!bio_list_empty(&list));
>> 	looped = true;
>> } while (1);
>>
>> would be a lot easier to read, and would avoid duplicating the list
>> manipulation.
>
> Yep, that looks cleaner. Although do we really need the looped variable?
> Can't we just move the need_resched() check right before the while (1)?

If you do that, then you'd also want to check if the list is empty. You
don't want to schedule_work() for a potentially empty list. Either way,
you need some check.
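
Roughly, moving the check to the bottom would end up something like
this (untested sketch, reusing the batch/list setup from above):

	do {
		local_lock_irq(&bio_complete_batch.lock);
		batch = this_cpu_ptr(&bio_complete_batch);
		list = batch->list;
		bio_list_init(&batch->list);
		local_unlock_irq(&bio_complete_batch.lock);

		if (bio_list_empty(&list))
			break;

		do {
			struct bio *bio = bio_list_pop(&list);
			bio->bi_end_io(bio);
		} while (!bio_list_empty(&list));

		if (!need_resched())
			continue;

		/* peek again, only punt if more completions queued up */
		local_lock_irq(&bio_complete_batch.lock);
		batch = this_cpu_ptr(&bio_complete_batch);
		if (!bio_list_empty(&batch->list))
			schedule_work_on(smp_processor_id(), &batch->work);
		local_unlock_irq(&bio_complete_batch.lock);
		break;
	} while (1);

which just trades the looped flag for an extra lock round trip.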

>>> +static void bio_queue_completion(struct bio *bio)
>>> +{
>>> +	struct bio_complete_batch *batch;
>>> +	unsigned long flags;
>>> +
>>> +	local_lock_irqsave(&bio_complete_batch.lock, flags);
>>> +	batch = this_cpu_ptr(&bio_complete_batch);
>>> +	bio_list_add(&batch->list, bio);
>>> +	local_unlock_irqrestore(&bio_complete_batch.lock, flags);
>>> +
>>> +	schedule_work_on(smp_processor_id(), &batch->work);
>>> +}
>>
>> Maybe do something à la:
>>
>> static void bio_queue_completion(struct bio *bio)
>> {
>> 	struct bio_complete_batch *batch;
>> 	unsigned long flags;
>> 	bool was_empty;
>>
>> 	local_lock_irqsave(&bio_complete_batch.lock, flags);
>> 	batch = this_cpu_ptr(&bio_complete_batch);
>> 	was_empty = bio_list_empty(&batch->list);
>> 	bio_list_add(&batch->list, bio);
>> 	local_unlock_irqrestore(&bio_complete_batch.lock, flags);
>>
>> 	if (was_empty)
>> 		schedule_work_on(smp_processor_id(), &batch->work);
>> }
>
> Makes sense, will do!
>
>> Outside of these mostly nits, I like this approach. It avoids my main
>> worry with this, which was contention on the list locks. And on the
>> io_uring side, we'll never hit the !in_task() path anyway, as the
>> completions are run from the task always. The bio flag makes sense for
>> this.
>
> Thanks! I'm going to give Dave's llist suggestion a shot on top of
> this, as it seems like it'll simplify things nicely. Looks like that'll
> involve turning bio::bi_next into a union with a struct llist_node.

Since these lists can get long, I'd keep an eye on llist reversal
overhead there...
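
For reference, the consumer side with an llist would end up looking
something like the below (sketch only; the per-cpu bio_complete_llist
and the bi_llist union member are illustrative names, not from the
patch):

	static void bio_complete_work_fn(struct work_struct *w)
	{
		struct llist_node *node;
		struct bio *bio, *next;

		/* llist_del_all() hands the entries back newest-first */
		node = llist_del_all(this_cpu_ptr(&bio_complete_llist));
		/* so restoring completion order costs an O(n) walk per batch */
		node = llist_reverse_order(node);

		llist_for_each_entry_safe(bio, next, node, bi_llist)
			bio->bi_end_io(bio);
	}

If the batches get big enough, that reversal walk is where the time
will go.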

--
Jens Axboe