[PATCH] null_blk: don't enable irqs when in irq

From: Rabin Vincent
Date: Fri Dec 25 2015 - 09:27:58 EST


When using irq_mode=NULL_IRQ_TIMER, blk_start_queue() is called from the
hrtimer interrupt. null_request_fn() then calls spin_unlock_irq(), which
triggers the lockdep splat below because interrupts get enabled while we
are still inside a hard interrupt handler.

Inside null_request_fn() we cannot know the state of the interrupt flags
that were saved before blk_start_queue() was called, but we can use
in_irq() to handle this case: only re-enable interrupts when we are not
running in hard interrupt context.

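For illustration, the pattern in isolation looks roughly like the sketch
below. This is not part of the patch (the actual change is the diff at the
end of this mail); example_request_fn and the "process the request"
placeholder are made up, while in_irq(), blk_fetch_request() and the
spinlock primitives are the real kernel interfaces:

/*
 * Sketch only: drop and retake the queue lock around per-request work,
 * re-enabling interrupts only when we were not entered from a hard
 * interrupt.  in_irq() is sampled once up front so that the unlock and
 * the matching relock always agree.
 */
#include <linux/blkdev.h>	/* struct request_queue, blk_fetch_request() */
#include <linux/hardirq.h>	/* in_irq() */
#include <linux/spinlock.h>

static void example_request_fn(struct request_queue *q)
{
	struct request *rq;
	bool irq = in_irq();	/* true when called from a hard irq handler */

	while ((rq = blk_fetch_request(q)) != NULL) {
		if (irq)
			spin_unlock(q->queue_lock);	/* keep irqs disabled */
		else
			spin_unlock_irq(q->queue_lock);	/* safe to re-enable */

		/* ... process the request without the queue lock held ... */

		if (irq)
			spin_lock(q->queue_lock);
		else
			spin_lock_irq(q->queue_lock);
	}
}

Sampling in_irq() once before the loop keeps the unlock/lock pairs
symmetric for the whole run of the function instead of re-checking the
context on every iteration.
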
------------[ cut here ]------------
WARNING: CPU: 0 PID: 398 at kernel/locking/lockdep.c:2608 trace_hardirqs_on_caller+0x11a/0x1b0()
DEBUG_LOCKS_WARN_ON(current->hardirq_context)
CPU: 0 PID: 398 Comm: mkfs.ext4 Not tainted 4.4.0-rc6+ #77
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
Call Trace:
<IRQ> [<ffffffff8134467c>] dump_stack+0x4e/0x82
[<ffffffff810501e2>] warn_slowpath_common+0x82/0xc0
[<ffffffff81560aac>] ? _raw_spin_unlock_irq+0x2c/0x60
[<ffffffff813da2b0>] ? null_softirq_done_fn+0x30/0x30
[<ffffffff8105026c>] warn_slowpath_fmt+0x4c/0x50
[<ffffffff8109cdaa>] trace_hardirqs_on_caller+0x11a/0x1b0
[<ffffffff8109ce4d>] trace_hardirqs_on+0xd/0x10
[<ffffffff81560aac>] _raw_spin_unlock_irq+0x2c/0x60
[<ffffffff813da31e>] null_request_fn+0x4e/0xb0
[<ffffffff8131dbc3>] __blk_run_queue+0x33/0x40
[<ffffffff8131de1f>] blk_start_queue+0x3f/0x80
[<ffffffff813da277>] end_cmd+0x117/0x120
[<ffffffff813da2c2>] null_cmd_timer_expired+0x12/0x20
[<ffffffff810bbaab>] __hrtimer_run_queues+0x12b/0x4b0
[<ffffffff810bc69f>] hrtimer_interrupt+0xaf/0x1b0
[<ffffffff810361f6>] local_apic_timer_interrupt+0x36/0x60
[<ffffffff81563ced>] smp_apic_timer_interrupt+0x3d/0x50
[<ffffffff8156217c>] apic_timer_interrupt+0x8c/0xa0
<EOI> [<ffffffff81344c9b>] ? fprop_fraction_percpu+0xeb/0x110
[<ffffffff8112bc8f>] ? __wb_calc_thresh+0x2f/0xc0
[<ffffffff8112bc8f>] __wb_calc_thresh+0x2f/0xc0
[<ffffffff8112bb8c>] ? domain_dirty_limits+0x1bc/0x1f0
[<ffffffff8112db85>] balance_dirty_pages_ratelimited+0x6d5/0xfb0
[<ffffffff810b0967>] ? rcu_read_lock_sched_held+0x77/0x90
[<ffffffff811a2c7a>] ? __block_commit_write.isra.1+0x7a/0xb0
[<ffffffff8112224c>] generic_perform_write+0x14c/0x1c0
[<ffffffff81123310>] __generic_file_write_iter+0x190/0x1f0
[<ffffffff811a763b>] blkdev_write_iter+0x7b/0x100
[<ffffffff8116b46a>] __vfs_write+0xaa/0xe0
[<ffffffff8116b875>] vfs_write+0x95/0x100
[<ffffffff81188a6f>] ? __fget_light+0x6f/0x90
[<ffffffff8116c037>] SyS_pwrite64+0x77/0x90
[<ffffffff815613b2>] entry_SYSCALL_64_fastpath+0x12/0x76
---[ end trace 39b7df36fb237be1 ]---

Signed-off-by: Rabin Vincent <rabin@xxxxxx>
---
drivers/block/null_blk.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/block/null_blk.c b/drivers/block/null_blk.c
index a428e4e..d16c5dc 100644
--- a/drivers/block/null_blk.c
+++ b/drivers/block/null_blk.c
@@ -342,13 +342,20 @@ static int null_rq_prep_fn(struct request_queue *q, struct request *req)
 static void null_request_fn(struct request_queue *q)
 {
 	struct request *rq;
+	bool irq = in_irq();
 
 	while ((rq = blk_fetch_request(q)) != NULL) {
 		struct nullb_cmd *cmd = rq->special;
 
-		spin_unlock_irq(q->queue_lock);
+		if (irq)
+			spin_unlock(q->queue_lock);
+		else
+			spin_unlock_irq(q->queue_lock);
 		null_handle_cmd(cmd);
-		spin_lock_irq(q->queue_lock);
+		if (irq)
+			spin_lock(q->queue_lock);
+		else
+			spin_lock_irq(q->queue_lock);
 	}
 }

--
2.6.2
