Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

From: Christian Borntraeger
Date: Wed Dec 20 2017 - 10:47:34 EST


On 12/18/2017 02:56 PM, Stefan Haberland wrote:
> On 07.12.2017 00:29, Christoph Hellwig wrote:
>> On Wed, Dec 06, 2017 at 01:25:11PM +0100, Christian Borntraeger wrote:
>>> commit 11b2025c3326f7096ceb588c3117c7883850c068    -> bad
>>>     blk-mq: create a blk_mq_ctx for each possible CPU
>>> does not boot on DASD and
>>> commit 9c6ae239e01ae9a9f8657f05c55c4372e9fc8bcc    -> good
>>>     genirq/affinity: assign vectors to all possible CPUs
>>> does boot with DASD disks.
>>>
>>> Also adding Stefan Haberland in case he has an idea why this fails on DASD, and adding Martin (for the
>>> s390 irq handling code).
>> That is interesting as it really isn't related to interrupts at all,
>> it just ensures that possible CPUs are set in ->cpumask.
>>
>> I guess we'd really want:
>>
>> e005655c389e3d25bf3e43f71611ec12f3012de0
>> "blk-mq: only select online CPUs in blk_mq_hctx_next_cpu"
>>
>> before this commit, but it seems like the whole stack didn't work for
>> you either.
>>
>> I wonder if there is some weird thing about nr_cpu_ids in s390?
>
> I tried this on my system and the blk-mq-hotplug-fix branch does not boot for me either.
> The disks get up and running and I/O works fine. At least the partition detection and EXT4-fs mount work.
>
> But at some point in time the disks do not get any requests.
>
> I currently have no clue why.
> I took a dump and had a look at the disk states and they are fine. No errors in the logs or in our debug entries. Just empty DASD devices waiting to be called for I/O requests.
>
> Do you have anything I could have a look at?

Jens, Christoph, so what do we do about this?
To summarize:
- commit 4b855ad37194f7 ("blk-mq: Create hctx for each present CPU") broke CPU hotplug.
- Jens' quick revert did fix the issue and did not break DASD support but has some issues
with interrupt affinity.
- Christoph's patch set fixes the hotplug issue for virtio-blk but causes I/O hangs on DASDs (even
without hotplug).

Christian