Re: [PATCH] scsi: ufs: core: move some irq handling back to hardirq (with time limit)

From: Neil Armstrong
Date: Mon Jul 28 2025 - 10:42:14 EST


On 28/07/2025 16:39, Manivannan Sadhasivam wrote:
On Mon, Jul 28, 2025 at 08:06:21PM GMT, Manivannan Sadhasivam wrote:
+ Nitin


Really added Nitin now.

BTW, what about MCQ on SM8650? It's probably the real fix here...

Neil


On Thu, Jul 24, 2025 at 02:38:30PM GMT, André Draszik wrote:
On Thu, 2025-07-24 at 13:54 +0200, Neil Armstrong wrote:
On 24/07/2025 13:44, André Draszik wrote:
On Thu, 2025-07-24 at 10:54 +0100, André Draszik wrote:
fio results on Pixel 6:
   read / 1 job     original    after    this commit
     min IOPS        4,653.60   2,704.40    3,902.80
     max IOPS        6,151.80   4,847.60    6,103.40
     avg IOPS        5,488.82   4,226.61    5,314.89
     cpu % usr           1.85       1.72        1.97
     cpu % sys          32.46      28.88       33.29
     bw MB/s            21.46      16.50       20.76

   read / 8 jobs    original    after    this commit
     min IOPS       18,207.80  11,323.00   17,911.80
     max IOPS       25,535.80  14,477.40   24,373.60
     avg IOPS       22,529.93  13,325.59   21,868.85
     cpu % usr           1.70       1.41        1.67
     cpu % sys          27.89      21.85       27.23
     bw MB/s            88.10      52.10       84.48

   write / 1 job    original    after    this commit
     min IOPS        6,524.20   3,136.00    5,988.40
     max IOPS        7,303.60   5,144.40    7,232.40
     avg IOPS        7,169.80   4,608.29    7,014.66
     cpu % usr           2.29       2.34        2.23
     cpu % sys          41.91      39.34       42.48
     bw MB/s            28.02      18.00       27.42

   write / 8 jobs   original    after    this commit
     min IOPS       12,685.40  13,783.00   12,622.40
     max IOPS       30,814.20  22,122.00   29,636.00
     avg IOPS       21,539.04  18,552.63   21,134.65
     cpu % usr           2.08       1.61        2.07
     cpu % sys          30.86      23.88       30.64
     bw MB/s            84.18      72.54       82.62

Given the severe performance drop introduced by the culprit
commit, it might make sense to instead just revert it for
6.16 now, while this patch here can mature and be properly
reviewed. At least then 6.16 will not have any performance
regression of such a scale.

The original change was designed to stop the interrupt handler
from starving the system, creating display artifacts, and causing
timeouts on system controller submission. While imperfect,
it would require some fine-tuning for smaller controllers,
like the one on the Pixel 6, that have fewer queues.

Well, the patch has solved one problem by creating another problem.
I don't think that's how things are normally done. A 40% bandwidth
and IOPS drop is not negligible.

And while I am referencing Pixel 6 above as it's the only device
I have available to test, I suspect all < v4 controllers / devices
are affected in a similar way, given the nature of the change.
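
For reference, the drop percentages can be recomputed from the avg IOPS
rows in the fio tables above. This is only a small sketch of the
arithmetic: the numbers are copied from the tables, while the names
AVG_IOPS, drop_pct and summarize are made up for illustration and are
not part of any patch in this thread.

```python
# Average IOPS copied from the fio tables in this thread (Pixel 6).
# Tuple order: (original, after the culprit commit, with this patch).
AVG_IOPS = {
    "read / 1 job": (5488.82, 4226.61, 5314.89),
    "read / 8 jobs": (22529.93, 13325.59, 21868.85),
    "write / 1 job": (7169.80, 4608.29, 7014.66),
    "write / 8 jobs": (21539.04, 18552.63, 21134.65),
}


def drop_pct(baseline, value):
    """Percentage by which value falls short of baseline."""
    return (baseline - value) / baseline * 100.0


def summarize(table):
    """Map each workload to (drop after culprit commit, drop with this patch)."""
    return {
        name: (drop_pct(orig, after), drop_pct(orig, patched))
        for name, (orig, after, patched) in table.items()
    }


if __name__ == "__main__":
    for name, (regressed, patched) in summarize(AVG_IOPS).items():
        print(f"{name:15s} after: -{regressed:5.1f}%   this commit: -{patched:4.1f}%")
```

The worst case is read / 8 jobs, where avg IOPS falls by roughly 41%
after the culprit commit. With this patch, avg IOPS stays within a few
percent of the original in all four workloads, but does not fully
recover.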


IMO we should just revert the offending commit for 6.16 and see how to properly
implement it in the next release. Even with this series, we are not on par with
the original IOPS, which is bad for everyone.

- Mani

--
மணிவண்ணன் சதாசிவம்