Re: [PATCH] pciehp: Fix race condition handling surprise link-down

From: Bjorn Helgaas
Date: Thu Mar 09 2017 - 09:46:30 EST


On Wed, Mar 08, 2017 at 04:27:26AM -0800, Raj, Ashok wrote:
> On Mon, Mar 06, 2017 at 06:24:17PM -0600, Bjorn Helgaas wrote:
> > On Fri, Feb 03, 2017 at 10:51:04AM -0600, Bjorn Helgaas wrote:
> >
> > Hi Ashok,
> >
> > Just a ping to make sure we're not deadlocked. I'm waiting for you,
> > so I hope you're not also waiting for me :) I'm not trying to rush you;
> > I just don't want to drop this by mistake.
> >
> Hi Bjorn
>
> No, we aren't deadlocked :-). I haven't gotten around to changing it to an
> ordered queue yet; I'm mostly worried about having to retest all the
> different combinations with ATTN, POWER_CTL, SLD.
>
> I'm depending on other folks to test SLD. They are tied up with other
> issues ATM.
>
> I have had another OEM test with several disks and multiple ATTN
> presses/cancels, and the current code seems to be working well so far,
> except for the SLD case.
>
> The change in the patch only ensures that we don't start another
> POWER_ON or POWER_OFF before the earlier operation has completed.
>
> Would it be all right to fix SLD with this version for now? That would give
> us sufficient time to work out and test a clean approach that works with
> all the different combinations and OEM systems.

This version avoids some SLD issues, but we can't guarantee that it is
a complete solution.

I don't really like to put in an incomplete solution because it
reduces the urgency for doing a proper fix, and it also complicates
debugging if we trip over an SLD issue we haven't seen yet during
testing.

Bjorn