Re: [PATCH v4] scsi: ufs: core: Don't perform UFS clkscale if host async scan in progress
From: Ziqi Chen
Date: Mon Jul 28 2025 - 23:03:25 EST
On 7/28/2025 2:34 PM, Peter Wang (王信友) wrote:
On Fri, 2025-07-25 at 07:54 -0700, Bart Van Assche wrote:
On 7/25/25 2:13 AM, Peter Wang (王信友) wrote:
Could we consider luns_avail instead of a mutex?
That would be wrong. I think it is essential that scan_mutex is used in
this patch. Additionally, the lock inversion is between devfreq->lock and
(c->notifiers)->rwsem, so it seems unlikely to me that Ziqi's patch is the
patch that introduced the reported lock inversion.
Bart.
Hi Bart,
This is a complex situation involving six locks, which may result in
a circular locking dependency.
Let me explain how a new circular locking dependency is formed:
CPU0: take &(c->notifiers)->rwsem#2, wait &devfreq->lock
CPU1: take &devfreq->lock, wait &shost->scan_mutex <= Lock introduced by this patch
CPU2: take &shost->scan_mutex, wait &q->sysfs_lock
CPU3: take &q->sysfs_lock, wait cpu_hotplug_lock
Hi Peter,
I don't think the dependency between CPU2 and CPU3 can happen.
CPU2:
__mutex_lock_common+0x1dc/0x371c -> (Waiting &q->sysfs_lock)
mutex_lock_nested+0x2c/0x38
blk_mq_realloc_hw_ctxs+0x94/0x9cc
blk_mq_init_allocated_queue+0x31c/0x1020
blk_mq_alloc_queue+0x130/0x214
scsi_alloc_sdev+0x708/0xad4
scsi_probe_and_add_lun+0x20c/0x27b4
CPU3:
cpus_read_lock+0x54/0x1e8 -> (Waiting cpu_hotplug_lock)
__cpuhp_state_add_instance+0x24/0x54
blk_mq_alloc_and_init_hctx+0x940/0xbec
blk_mq_realloc_hw_ctxs+0x290/0x9cc -> (holding &q->sysfs_lock)
blk_mq_init_allocated_queue+0x31c/0x1020
__blk_mq_alloc_disk+0x138/0x2b0
loop_add+0x2ac/0x840
loop_init+0xe8/0x10c
As I understand it, for a single sdev, alloc_disk() and alloc_queue() are
synchronous. For multiple sdevs, they hold different &q->sysfs_lock
instances, since each sdev is allocated a different request_queue.
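For reference, a simplified sketch of why the instances differ (illustrative
only; the real struct has many more fields and this is not the exact upstream
layout): sysfs_lock is embedded in each request_queue, so queues created for
different sdevs lock distinct mutex objects.

/*
 * Simplified sketch, not the exact upstream definition: the only point is
 * that sysfs_lock lives inside each request_queue, so every queue (and
 * hence every sdev) takes its own mutex instance.
 */
struct request_queue {
        /* ... many fields elided ... */
        struct mutex    sysfs_lock;     /* one instance per queue */
        /* ... */
};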
In addition to the above, if you check the latest kernel, the function
blk_mq_realloc_hw_ctxs() has been changed several times recently and no
longer holds &q->sysfs_lock:
https://lore.kernel.org/all/20250304102551.2533767-5-nilay@xxxxxxxxxxxxx/
-> uses &q->elevator_lock instead of &q->sysfs_lock.
https://lore.kernel.org/all/20250403105402.1334206-1-ming.lei@xxxxxxxxxx/
-> doesn't take &q->elevator_lock in the blk_mq_init_allocated_queue() context.
CPU4: take cpu_hotplug_lock, wait subsys mutex#2
CPU5: take subsys mutex#2, wait &(c->notifiers)->rwsem#2 <= held by CPU0
ufshcd_add_lus() triggers ufshcd_devfreq_init().
This means that clock scaling can be performed while LUNs are being scanned.
However, this patch adds another lock to prevent clock scaling before the
LUN scan is complete. This is a paradoxical situation: if we cannot do
clock scaling before the LUN scan is complete, then why do we start clock
scaling before it?
If we don't tie it to luns_avail (i.e., start clock scaling only after the
LUN scan completes), do you have a better suggestion for where to
initialize clock scaling (ufshcd_devfreq_init())?
I have also considered this. As you can see in the older version of this
patch (v2), I moved ufshcd_devfreq_init() out of ufshcd_add_lus(). But
because ufshcd_add_lus() runs asynchronously, even with it moved out we
still cannot guarantee that clock scaling is triggered only after all LUs
have been probed.
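For clarity, a rough sketch of what the scan_mutex approach looks like in
the devfreq path (illustrative only, not the literal v4 diff; the exact
hook point in ufshcd_devfreq_target() and the surrounding code are
assumptions):

/*
 * Illustrative sketch only, not the literal v4 diff. The devfreq core
 * invokes this callback with devfreq->lock held; taking the SCSI host's
 * scan_mutex here serializes clock scaling against the async LUN scan,
 * which is also what creates the devfreq->lock -> scan_mutex ordering
 * shown for CPU1 above.
 */
static int ufshcd_devfreq_target(struct device *dev, unsigned long *freq,
                                 u32 flags)
{
        struct ufs_hba *hba = dev_get_drvdata(dev);

        mutex_lock(&hba->host->scan_mutex);
        /* ... existing clock scaling work ... */
        mutex_unlock(&hba->host->scan_mutex);

        return 0;
}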
BRs
Ziqi
Thanks.
Peter