[PATCH 0/3] fix interrupt swamp in NVMe

From: longli
Date: Tue Aug 20 2019 - 02:14:51 EST


From: Long Li <longli@xxxxxxxxxxxxx>

This patch set tries to fix interrupt swamp in NVMe devices.

On large systems with many CPUs, several CPUs may share one NVMe hardware
queue. A situation can arise where several CPUs issue I/Os and all of the
completions are delivered on the one CPU to which the hardware queue is bound.
That CPU can then be swamped by interrupts and stay in interrupt context for
an extended time while the other CPUs continue to issue I/O. This can trigger
watchdog and RCU timeouts and make the system unresponsive.

This patch set addresses the problem by enforcing scheduling and throttling
I/O when a CPU is starved in this way.
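
To make the idea concrete, here is a rough user-space sketch of the kind of
check described above. It is not the code from the patches: the struct, the
function name should_defer() and the 500 ms threshold are all assumptions made
for illustration. The two inputs it relies on, a per-CPU context switch count
and an idle test, correspond to what patches 1 and 2 make available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU snapshot; the field names are illustrative only. */
struct cpu_sample {
	uint64_t ctx_switches;	/* context switch count at the last progress */
	uint64_t stamp_ns;	/* time at which progress was last observed */
};

/* Assumed starvation threshold (500 ms); not taken from the patches. */
#define STARVED_NS 500000000ULL

/*
 * Return true when completions should be moved out of interrupt context
 * (for example, to a work queue) because the CPU has made no scheduling
 * progress for longer than STARVED_NS.
 */
static bool should_defer(struct cpu_sample *s, bool in_irq, bool cpu_idle,
			 uint64_t ctx_switches_now, uint64_t now_ns)
{
	assert(s != NULL);

	/*
	 * Not in interrupt context, nothing waiting to run, or the
	 * scheduler has switched tasks since the last check: record the
	 * progress and complete inline.
	 */
	if (!in_irq || cpu_idle || ctx_switches_now != s->ctx_switches) {
		s->ctx_switches = ctx_switches_now;
		s->stamp_ns = now_ns;
		return false;
	}

	/* No context switch since the snapshot: defer once past the limit. */
	return now_ns - s->stamp_ns > STARVED_NS;
}
```

In this sketch a completion path would call should_defer() for each request
and hand the request to process context when it returns true, which gives the
scheduler a chance to run on the flooded CPU.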

Long Li (3):
  sched: define a function to report the number of context switches on a
    CPU
  sched: export idle_cpu()
  nvme: complete request in work queue on CPU with flooded interrupts

 drivers/nvme/host/core.c | 57 +++++++++++++++++++++++++++++++++++++++-
 drivers/nvme/host/nvme.h |  1 +
 include/linux/sched.h    |  2 ++
 kernel/sched/core.c      |  7 +++++
 4 files changed, 66 insertions(+), 1 deletion(-)

--
2.17.1