Re: [PATCH v2] blk_iocost: remove some duplicate irq disable/enables

From: Jens Axboe
Date: Thu Oct 03 2024 - 12:25:17 EST


On 10/3/24 9:49 AM, Waiman Long wrote:
> On 10/3/24 10:38, Jens Axboe wrote:
>> On 10/3/24 8:31 AM, Dan Carpenter wrote:
>>> On Thu, Oct 03, 2024 at 07:21:25AM -0600, Jens Axboe wrote:
>>>> On 10/3/24 6:03 AM, Dan Carpenter wrote:
>>>>> 3117 ioc_now(iocg->ioc, &now);
>>>>> 3118 weight_updated(iocg, &now);
>>>>> 3119 spin_unlock(&iocg->ioc->lock);
>>>>> 3120 }
>>>>> 3121 }
>>>>> 3122 spin_unlock_irq(&blkcg->lock);
>>>>> 3123
>>>>> 3124 return nbytes;
>>>>> 3125 }
>>>>> 3126
>>>>> 3127 blkg_conf_init(&ctx, buf);
>>>>> 3128
>>>>> 3129 ret = blkg_conf_prep(blkcg, &blkcg_policy_iocost, &ctx);
>>>>> 3130 if (ret)
>>>>> 3131 goto err;
>>>>> 3132
>>>>> 3133 iocg = blkg_to_iocg(ctx.blkg);
>>>>> 3134
>>>>> 3135 if (!strncmp(ctx.body, "default", 7)) {
>>>>> 3136 v = 0;
>>>>> 3137 } else {
>>>>> 3138 if (!sscanf(ctx.body, "%u", &v))
>>>>> 3139 goto einval;
>>>>> 3140 if (v < CGROUP_WEIGHT_MIN || v > CGROUP_WEIGHT_MAX)
>>>>> 3141 goto einval;
>>>>> 3142 }
>>>>> 3143
>>>>> 3144 spin_lock(&iocg->ioc->lock);
>>>>>
>>>>> But why is this not spin_lock_irq()? I haven't analyzed this so maybe it's
>>>>> fine.
>>>> That's a bug.
>>>>
>>> I could obviously write this patch but I feel stupid writing the
>>> commit message. My level of understanding is Monkey See Monkey do.
>>> Could you take care of this?
>> Sure - or let's add Tejun who knows this code better. Ah he's already
>> added. Tejun?
>>
>>> So somewhere we're taking a lock in the IRQ handler and this can lead
>>> to a deadlock? I thought this would have been caught by lockdep?
>> It's nested inside blkcg->lock which is IRQ safe, that is enough. But
>> doing a quick scan of the file, the usage is definitely (widly)
>> inconsistent. Most times ioc->lock is grabbed disabling interrupts, but
>> there are also uses that doesn't disable interrupts, coming from things
>> like seq_file show paths which certainly look like they need it. lockdep
>> should certainly warn about this, only explanation I have is that nobody
>> bothered to do that :-)
>
> The lockdep validator will only warn about this if a debug kernel with
> lockdep enabled has run a workload that exercises all the relevant
> locking sequences that can implicate a potential for deadlock.

Sure that's obvious, but there are quite a few easy ones in there, so
seems like it should be easy to trigger. It's not like it's only some
odd path, the irq on/off looks trivial.

--
Jens Axboe