Re: [PATCH] blk-iocost: use irq-safe locking in cgroup handlers

From: Yu Kuai

Date: Tue Jun 02 2026 - 12:14:00 EST


Hi,

On 2026/6/2 23:56, Yu Kuai wrote:
> Hi,
>
> On 2026/6/2 21:25, Jens Axboe wrote:
>> On 6/1/26 3:50 PM, Bart Van Assche wrote:
>>> On 5/31/26 11:13 PM, Yu Kuai wrote:
>>>> @@ -3378,14 +3378,14 @@ static u64 ioc_cost_model_prfill(struct seq_file *sf,
>>>>       if (!dname)
>>>>           return 0;
>>>>   -    spin_lock(&ioc->lock);
>>>> +    spin_lock_irq(&ioc->lock);
>>>>       seq_printf(sf, "%s ctrl=%s model=linear "
>>>>              "rbps=%llu rseqiops=%llu rrandiops=%llu "
>>>>              "wbps=%llu wseqiops=%llu wrandiops=%llu\n",
>>>>              dname, ioc->user_cost_model ? "user" : "auto",
>>>>              u[I_LCOEF_RBPS], u[I_LCOEF_RSEQIOPS], u[I_LCOEF_RRANDIOPS],
>>>>              u[I_LCOEF_WBPS], u[I_LCOEF_WSEQIOPS], u[I_LCOEF_WRANDIOPS]);
>>>> -    spin_unlock(&ioc->lock);
>>>> +    spin_unlock_irq(&ioc->lock);
>>>>       return 0;
>>>>   }
>>> This change is wrong. ioc_cost_model_prfill() only has one caller,
>>> namely blkcg_print_blkgs(). blkcg_print_blkgs() calls the above function
>>> with interrupts disabled. The spin_unlock_irq(&ioc->lock) at the end of
>>> the above function enables interrupts while q->queue_lock is held. If an
>>> interrupt happens on the same CPU core before q->queue_lock is unlocked,
>>> and that interrupt tries to lock q->queue_lock, a deadlock will occur.
>> Agree, it's broken. Which makes me suspicious of the traces shown. Yu,
>> can you please shed some light on this?
> Looks like my reply is in your spam again :(
>
> The trace is from ioc_weight_write(), which does have the problem. While
> reviewing related code, I wrongly assumed ioc_cost_model_prfill() had
> the same problem and changed it as well.
>
>> I've dropped it, thanks Bart.
> I'll send a v2 that only fixes ioc_weight_write().

I just updated to the latest branch and tried this patch, but I couldn't
reproduce the problem. It turns out that blkg_conf_prep() already disables
irqs via spin_lock_irq(&q->queue_lock), so there is no problem at all.
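
To make the nesting concrete, here is a rough userspace model of the irq
state, not kernel code. The model_* helpers and outer_held_with_irqs_on()
are made-up names for illustration; they only track an "irqs enabled" flag
and whether each lock is held. It shows both points: a plain lock is fine
under an outer spin_lock_irq(), and an inner spin_unlock_irq() turns irqs
back on while the outer lock is still held.

```c
#include <stdbool.h>

/*
 * Userspace model only. These helpers are hypothetical stand-ins, not the
 * kernel's spin_lock()/spin_lock_irq(); they just track an "irqs enabled"
 * flag for one CPU and whether a lock is held.
 */
static bool irqs_enabled = true;

struct model_lock {
	bool held;
};

/* Models spin_lock_irq(): disable irqs unconditionally, then lock. */
static void model_lock_irq(struct model_lock *l)
{
	irqs_enabled = false;
	l->held = true;
}

/* Models spin_unlock_irq(): unlock, then enable irqs unconditionally. */
static void model_unlock_irq(struct model_lock *l)
{
	l->held = false;
	irqs_enabled = true;
}

/* Model plain spin_lock()/spin_unlock(): the irq state is left untouched. */
static void model_lock(struct model_lock *l)
{
	l->held = true;
}

static void model_unlock(struct model_lock *l)
{
	l->held = false;
}

/*
 * Returns true if, right after the inner critical section, the outer lock
 * is still held while irqs are enabled (the window Bart described).
 */
static bool outer_held_with_irqs_on(bool inner_uses_irq_variant)
{
	struct model_lock queue_lock = { false };
	struct model_lock ioc_lock = { false };
	bool window;

	irqs_enabled = true;
	model_lock_irq(&queue_lock);		/* outer lock, taken by the caller */

	if (inner_uses_irq_variant) {
		model_lock_irq(&ioc_lock);
		model_unlock_irq(&ioc_lock);	/* turns irqs back on too early */
	} else {
		model_lock(&ioc_lock);		/* irqs are already off here */
		model_unlock(&ioc_lock);
	}

	window = queue_lock.held && irqs_enabled;
	model_unlock_irq(&queue_lock);
	return window;
}
```

In this model outer_held_with_irqs_on(false) returns false and
outer_held_with_irqs_on(true) returns true, which matches the two cases
discussed above.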

The trace I hit is caused by some pending patches that switch blkcg
protection from queue_lock to blkcg_mutex; with them applied, the
spin_lock_irq(&q->queue_lock) is removed.

Sorry for the noise, I should have checked first whether this problem was
introduced by my own changes. And thanks to Bart for catching it.

>
--
Thanks,
Kuai