Re: blk-mq + bfq IO hangs after writing partition table
From: Ming Lei
Date: Fri Dec 01 2017 - 08:36:18 EST
On Fri, Dec 01, 2017 at 12:18:25PM +0000, Ivan Kozik wrote:
> On Fri, Dec 1, 2017 at 8:50 AM, Ming Lei <ming.lei@xxxxxxxxxx> wrote:
> > Could you run the following script[1] and provide us the result after
> > the IO hang is triggered?
> >
> > #./dump-blk-info /dev/sdX #/dev/sdX is name of your USB disk
> >
> > [1] http://people.redhat.com/minlei/tests/tools/dump-blk-info
>
> Hi Ming,
>
> Thanks for taking a look. I have pasted and attached (in case of
> gmail corruption) these files:
>
> Samsung-128GB-before: Samsung 128GB USB 3.0 Fit plugged into HP 8460p,
> before gdisk write
>
> Samsung-128GB-after: Samsung 128GB USB 3.0 Fit plugged into HP 8460p,
> after gdisk write
>
> Generic-4GB-after: some awful generic but working 4GB USB flash
> drive plugged into HP 8460p, after gdisk write
>
> I have reproduced this again on the same kernel revision and blk-mq bfq with:
>
> minimized cmdline: scsi_mod.use_blk_mq=y
> gdisk writing the partition table as-is, no changes
>
> and confirmed again that kyber never triggers this.
>
> Thanks,
>
> Ivan
>
>
>
>
> Samsung-128GB-before
> =============sdc/hctx0==================
> $active
> 0
>
> $busy
>
> $/sys/kernel/debug/block/sdc//hctx0/cpu0
> $completed
> 3 14
> $dispatched
> 3 14
> $merged
> 0
> $rq_list
>
> $/sys/kernel/debug/block/sdc//hctx0/cpu1
> $completed
> 159 76
> $dispatched
> 159 76
> $merged
> 0
> $rq_list
>
> $/sys/kernel/debug/block/sdc//hctx0/cpu2
> $completed
> 84 57
> $dispatched
> 84 57
> $merged
> 0
> $rq_list
>
> $/sys/kernel/debug/block/sdc//hctx0/cpu3
> $completed
> 154 54
> $dispatched
> 155 54
> $merged
> 1
> $rq_list
>
> $ctx_map
> 00000000: 00
>
> $dispatch
>
> $dispatched
> 0 8
> 1 701
> 2 0
> 4 0
> 8 0
> 16 0
> 32+ 0
>
> $flags
> alloc_policy=FIFO SHOULD_MERGE|SG_MERGE
>
> $io_poll
> considered=0
> invoked=0
> success=0
>
> $queued
> 602
>
> $run
> 843
>
> $sched_tags
> nr_tags=2
> nr_reserved_tags=0
> active_queues=0
>
> bitmap_tags:
> depth=2
> busy=1
> bits_per_word=64
> map_nr=1
> alloc_hint={0, 1, 0, 0, 0, 1, 1, 0}
> wake_batch=1
> wake_index=0
> ws={
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> }
> round_robin=0
>
> $sched_tags_bitmap
> 00000000: 01
>
> $state
>
>
> $tags
> nr_tags=1
> nr_reserved_tags=0
> active_queues=0
>
> bitmap_tags:
> depth=1
> busy=0
> bits_per_word=64
> map_nr=1
> alloc_hint={0, 0, 0, 0, 0, 0, 0, 0}
> wake_batch=1
> wake_index=0
> ws={
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> }
> round_robin=0
>
> $tags_bitmap
> 00000000: 00
>
>
>
>
> Samsung-128GB-after
> =============sdc/hctx0==================
> $active
> 0
>
> $busy
>
> $/sys/kernel/debug/block/sdc//hctx0/cpu0
> $completed
> 3 14
> $dispatched
> 3 14
> $merged
> 0
> $rq_list
>
> $/sys/kernel/debug/block/sdc//hctx0/cpu1
> $completed
> 159 95
> $dispatched
> 159 96
> $merged
> 0
> $rq_list
>
> $/sys/kernel/debug/block/sdc//hctx0/cpu2
> $completed
> 84 58
> $dispatched
> 84 58
> $merged
> 0
> $rq_list
>
> $/sys/kernel/debug/block/sdc//hctx0/cpu3
> $completed
> 154 54
> $dispatched
> 155 54
> $merged
> 1
> $rq_list
>
> $ctx_map
> 00000000: 00
>
> $dispatch
>
> $dispatched
> 0 8
> 1 721
> 2 0
> 4 0
> 8 0
> 16 0
> 32+ 0
>
> $flags
> alloc_policy=FIFO SHOULD_MERGE|SG_MERGE
>
> $io_poll
> considered=0
> invoked=0
> success=0
>
> $queued
> 623
>
> $run
> 887
>
> $sched_tags
> nr_tags=2
> nr_reserved_tags=0
> active_queues=0
>
> bitmap_tags:
> depth=2
> busy=2
> bits_per_word=64
> map_nr=1
> alloc_hint={0, 0, 0, 0, 0, 1, 1, 0}
> wake_batch=1
> wake_index=0
> ws={
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=active},
> {.wait_cnt=1, .wait=active},
> {.wait_cnt=1, .wait=active},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> }
> round_robin=0
>
> $sched_tags_bitmap
> 00000000: 03
>
> $state
>
>
> $tags
> nr_tags=1
> nr_reserved_tags=0
> active_queues=0
>
> bitmap_tags:
> depth=1
> busy=0
> bits_per_word=64
> map_nr=1
> alloc_hint={0, 0, 0, 0, 0, 0, 0, 0}
> wake_batch=1
> wake_index=0
> ws={
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> }
> round_robin=0
>
> $tags_bitmap
> 00000000: 00
>
>
>
>
> Generic-4GB-after
> =============sdc/hctx0==================
> $active
> 0
>
> $busy
>
> $/sys/kernel/debug/block/sdc//hctx0/cpu0
> $completed
> 94 32
> $dispatched
> 94 33
> $merged
> 0
> $rq_list
>
> $/sys/kernel/debug/block/sdc//hctx0/cpu1
> $completed
> 561 113
> $dispatched
> 562 113
> $merged
> 0
> $rq_list
>
> $/sys/kernel/debug/block/sdc//hctx0/cpu2
> $completed
> 251 179
> $dispatched
> 251 179
> $merged
> 4
> $rq_list
>
> $/sys/kernel/debug/block/sdc//hctx0/cpu3
> $completed
> 228 63
> $dispatched
> 228 63
> $merged
> 0
> $rq_list
>
> $ctx_map
> 00000000: 00
>
> $dispatch
>
> $dispatched
> 0 21
> 1 1707
> 2 0
> 4 0
> 8 0
> 16 0
> 32+ 0
>
> $flags
> alloc_policy=FIFO SHOULD_MERGE|SG_MERGE
>
> $io_poll
> considered=0
> invoked=0
> success=0
>
> $queued
> 1523
>
> $run
> 2000
>
> $sched_tags
> nr_tags=2
> nr_reserved_tags=0
> active_queues=0
>
> bitmap_tags:
> depth=2
> busy=2
> bits_per_word=64
> map_nr=1
> alloc_hint={0, 0, 0, 0, 1, 1, 0, 0}
> wake_batch=1
> wake_index=0
> ws={
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=active},
> {.wait_cnt=1, .wait=active},
> {.wait_cnt=1, .wait=active},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
> {.wait_cnt=1, .wait=inactive},
Hi Ivan,
Thanks for your post and test!
Basically all IO hang happen inside get_request(), and seems this issue
isn't related with recent change in V4.15, could you run V4.14 to see if
there is such issue?
I guess it might be related with sbitmap_queue's wakeup because SCSI
always run queue after one request is completed, and there isn't any
request in hctx->dispatch, and for all requests in scheduler queue,
there is always chance to dispatch them from SCSI's restart(scsi_end_request).
Maybe need Jens/Omar to take a look.
Thanks,
Ming