mpt3sas_cm2: attempting host reset! scmd(ffff9e8a88623d48)

From: Özkan Göksu
Date: Tue Jul 31 2018 - 09:57:29 EST


Hello.

When a disk starts to return I/O errors, mpt3sas first resets the device. If the disk still returns "task abort" and "I/O error" after that reset, mpt3sas resets the whole HBA card. Because of this I lose one HBA (LSI) card, which means 100 disks at the same time, for 30-50 seconds.
I have had this problem for 9-10 months, so I worked around it by adding extra timeout to my disks.

--> "for drive in /sys/block/sd*; do echo 180 > $drive/device/timeout; done"
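
A slightly more defensive form of the same loop, as a rough sketch only: it skips entries that have no writable timeout attribute and prints the old and new value (in seconds) for each disk. The function name set_disk_timeouts and its optional second argument (an alternative root directory, handy for a dry run) are made up for this example.

```shell
#!/bin/sh
# Sketch: set the SCSI command timeout (seconds) for every sd* disk.
# set_disk_timeouts is a hypothetical helper name, not an existing tool.
# $1 = timeout in seconds, $2 = sysfs block directory (default /sys/block)
set_disk_timeouts() {
    timeout="$1"
    root="${2:-/sys/block}"
    for drive in "$root"/sd*; do
        f="$drive/device/timeout"
        # Skip entries without a writable timeout attribute
        # (this also covers the case where the glob matched nothing).
        [ -w "$f" ] || continue
        old=$(cat "$f")
        echo "$timeout" > "$f"
        echo "$(basename "$drive"): $old -> $timeout"
    done
}
```

Run as root with "set_disk_timeouts 180". The value is written to sysfs only, so it has to be applied again after a reboot or after a disk is re-added.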

After increasing the disk timeout I still saw I/O errors for a long time, but mpt3sas did not reset the HBA card, so I thought the problem was solved. But after a few months the same problem occurred again somehow.
Now I'm looking for a better solution. Any ideas, please?


OS: Arch Linux 4.14.40-1-lts #1 SMP Wed May 9 13:00:32 CEST 2018 x86_64 GNU/Linux

[Mon Jul 23 05:16:09 2018] mpt3sas_cm2: attempting host reset! scmd(ffff9e8a88623d48)
[Mon Jul 23 05:16:09 2018] mpt3sas_cm2: sending diag reset !!
[Mon Jul 23 05:16:10 2018] mpt3sas_cm2: diag reset: SUCCESS
[Mon Jul 23 05:16:10 2018] mpt3sas_cm2: LSISAS3008: FWVersion(13.00.00.00), ChipRevision(0x02), BiosVersion(08.35.00.00)
[Mon Jul 23 05:16:10 2018] mpt3sas_cm2: Protocol=(
[Mon Jul 23 05:16:10 2018] mpt3sas_cm2: sending port enable !!
[Mon Jul 23 05:16:17 2018] mpt3sas_cm2: port enable: SUCCESS
[Mon Jul 23 05:16:17 2018] mpt3sas_cm2: search for end-devices: start
[Mon Jul 23 05:16:17 2018] mpt3sas_cm2: search for end-devices: complete
[Mon Jul 23 05:16:17 2018] mpt3sas_cm2: search for expanders: start
[Mon Jul 23 05:16:17 2018] mpt3sas_cm2: search for expanders: complete
[Mon Jul 23 05:16:17 2018] mpt3sas_cm2: host reset: SUCCESS scmd(ffff9e8a88623d48)
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: removing unresponding devices: start
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: removing unresponding devices: end-devices
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: removing unresponding devices: expanders
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: removing unresponding devices: complete
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: scan devices: start
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: scan devices: expanders start
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: break from expander scan: ioc_status(0x0022), loginfo(0x310f0400)
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: scan devices: expanders complete
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: scan devices: end devices start
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: break from end device scan: ioc_status(0x0022), loginfo(0x310f0400)
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: scan devices: end devices complete
[Mon Jul 23 05:16:38 2018] mpt3sas_cm2: scan devices: complete
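
To see how often each controller goes through this sequence, the kernel log can be counted per controller. This is only a sketch that matches the "attempting host reset" message text shown above; the function name count_host_resets is made up for this example, and it reads log text on standard input.

```shell
#!/bin/sh
# Sketch: count "attempting host reset" events per mpt3sas controller.
# count_host_resets is a hypothetical helper name; it reads kernel log
# text on stdin and prints "<controller> <count>" lines.
count_host_resets() {
    awk '/attempting host reset/ {
        # Find the field that names the controller, e.g. "mpt3sas_cm2:"
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^mpt3sas_cm[0-9]+:$/) {
                name = $i
                sub(/:$/, "", name)
                count[name]++
                break
            }
        }
    }
    END {
        for (c in count) print c, count[c]
    }'
}
```

For example: "dmesg -T | count_host_resets", or feed it a saved log file with "count_host_resets < saved.log" (the file name is a placeholder).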