Re: [5.2-rc1 regression]: nvme vs. hibernation
From: Jiri Kosina
Date: Fri May 24 2019 - 18:30:25 EST
On Fri, 24 May 2019, Keith Busch wrote:
> > Something is broken in Linus' tree (4dde821e429) with respect to
> > hibernation on my thinkpad x270, and it seems to be nvme related.
> > I reliably see the warning below during hibernation, and then sometimes
> > resume sort of works but the machine misbehaves here and there (seems like
> > lost IRQs), sometimes it never comes back from the hibernated state.
> > I will not have much time to look into this over the weekend, so I am
> > sending this out as-is in case anyone has an immediate idea. Otherwise I'll
> > bisect it on Monday (I don't even know at the moment what exactly was the
> > last version that worked reliably, I'll have to figure that out as well
> > later).
> I believe the warning call trace was introduced when we converted nvme to
> lock-less completions. On device shutdown, we'll check queues for any
> pending completions, and we temporarily disable the interrupts to make
> sure that the queue's interrupt handler can't run concurrently.
Yeah, the completion changes were the primary reason why I brought this up
with all of you guys in CC.
> On hibernation, most CPUs are offline, and the interrupt re-enabling
> is hitting this warning that says the IRQ is not associated with any
> online CPUs.
> I'm sure we can find a way to fix this warning, but I'm not sure that
> explains the rest of the symptoms you're describing though.
It seems to be more or less reliable enough to bisect. I'll try that on
Monday and will let you know.
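
For anyone following along, the sequence Keith describes above (mask the
queue's IRQ, reap whatever completions are pending, unmask the IRQ again)
can be modelled with a small user-space toy. This is only a sketch under
stated assumptions, not the driver's code: ToyQueue, drain_with_irq_masked
and the warning string are hypothetical names made up for illustration, and
the "no online CPU" check is a simplified stand-in for whatever the IRQ core
actually tests when the interrupt is re-enabled.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class ToyQueue:
    """Hypothetical stand-in for one completion queue and its IRQ."""
    affinity: Set[int]                                  # CPUs the IRQ may target
    pending: List[int] = field(default_factory=list)    # tags not yet reaped
    irq_enabled: bool = True


def drain_with_irq_masked(queue: ToyQueue,
                          online_cpus: Set[int]) -> Tuple[List[int], List[str]]:
    """Reap pending completions while the queue's IRQ is masked.

    Returns (reaped_tags, warnings). The warning models the situation from
    the thread: the IRQ is re-enabled while none of the CPUs in its
    affinity set is online.
    """
    warnings: List[str] = []

    # Mask the IRQ so the (modelled) handler cannot run concurrently.
    queue.irq_enabled = False

    # Reap everything that completed before the mask took effect.
    reaped = list(queue.pending)
    queue.pending.clear()

    # Unmask again. Simplified assumption: warn if no CPU in the affinity
    # set is online at this point.
    if not (queue.affinity & online_cpus):
        warnings.append("IRQ affinity has no online CPU")
    queue.irq_enabled = True

    return reaped, warnings
```

With all CPUs online, drain_with_irq_masked(ToyQueue({2, 3}, [7, 9]),
{0, 1, 2, 3}) returns the reaped tags and no warnings. With only CPU 0
online, as in the hibernation case, the same call returns the same tags
plus the warning, which matches Keith's point that the drain itself is fine
and only the re-enable step trips.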