Re: Loopback device hung [was Re: xfs deadlock on 3.9-rc5 running xfstests case #78]

From: Jens Axboe
Date: Wed Apr 03 2013 - 07:41:58 EST


On Tue, Apr 02 2013, Jens Axboe wrote:
> On Tue, Apr 02 2013, CAI Qian wrote:
> >
> >
> > ----- Original Message -----
> > > From: "Jens Axboe" <axboe@xxxxxxxxx>
> > > To: "CAI Qian" <caiqian@xxxxxxxxxx>
> > > Cc: "Dave Chinner" <david@xxxxxxxxxxxxx>, xfs@xxxxxxxxxxx, "LKML" <linux-kernel@xxxxxxxxxxxxxxx>
> > > Sent: Tuesday, April 2, 2013 5:00:47 PM
> > > Subject: Re: Loopback device hung [was Re: xfs deadlock on 3.9-rc5 running xfstests case #78]
> > >
> > > On Tue, Apr 02 2013, CAI Qian wrote:
> > > >
> > > >
> > > > ----- Original Message -----
> > > > > From: "Jens Axboe" <axboe@xxxxxxxxx>
> > > > > To: "Dave Chinner" <david@xxxxxxxxxxxxx>
> > > > > Cc: "CAI Qian" <caiqian@xxxxxxxxxx>, xfs@xxxxxxxxxxx, "LKML"
> > > > > <linux-kernel@xxxxxxxxxxxxxxx>
> > > > > Sent: Tuesday, April 2, 2013 3:30:35 PM
> > > > > Subject: Re: Loopback device hung [was Re: xfs deadlock on 3.9-rc5
> > > > > running xfstests case #78]
> > > > >
> > > > > On Tue, Apr 02 2013, Jens Axboe wrote:
> > > > > > On Tue, Apr 02 2013, Dave Chinner wrote:
> > > > > > > > [Added Jens Axboe to CC]
> > > > > > >
> > > > > > > On Tue, Apr 02, 2013 at 02:08:49AM -0400, CAI Qian wrote:
> > > > > > > > > Saw on almost all the servers, ranging from x64, ppc64 and s390x,
> > > > > > > > > with kernel 3.9-rc5 and xfsprogs-3.1.10. Never caught this in
> > > > > > > > > 3.9-rc4, so it looks like something new broke this. Log is here
> > > > > > > > > with sysrq debug info.
> > > > > > > > http://people.redhat.com/qcai/stable/log
> > > > > >
> > > > > > CAI Qian, can you try and back the below out and test again?
> > > > >
> > > > > > Never mind, it's clearly that one. The below should improve the
> > > > > situation, but it's not pretty. A better fix would be to allow
> > > > > > auto-deletion even if NO_PART_SCAN is set.
> > > > Jens, when I compiled mainline (up to fefcdbe) with this patch,
> > > > it errored out,
> > >
> > > Looks like I sent the wrong one, updated below.
> > The patch works well. Thanks!
>
> Thanks for testing! I don't particularly like this stuff in loop,
> though. It's quite nasty and depends on other behaviour. It would be
> prettier if we just had rescan_partitions() do the right thing, and only
> drop partitions and not rescan if NO_PART_SCAN is set.
>
> Ala the below, dropping the loop change and implementing that change in
> the core code. Phillip, can you check whether this does the right thing
> for your bug too?

Phillip? I'm going to revert the loop change asap, so if you want this
fixed for 3.10, it's about that time to test it out.
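
For anyone following along without the patch in front of them, here is a
rough user-space model of the behaviour described in the quoted text: always
drop the existing partitions, and skip the rescan when NO_PART_SCAN is set.
This is only a sketch and not the actual patch. The struct, the flag value
and every model_* name are made up for illustration; the real code works on
the kernel's gendisk and block device structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's "do not scan partitions" disk flag. */
#define MODEL_FL_NO_PART_SCAN 0x1u

/* Hypothetical, heavily simplified model of a disk. */
struct model_disk {
    unsigned int flags;  /* MODEL_FL_* bits */
    int nr_parts;        /* partitions currently registered */
    int parts_on_media;  /* partitions a scan of the media would find */
};

/* Remove every registered partition; returns how many were dropped. */
static int model_drop_partitions(struct model_disk *disk)
{
    int dropped = disk->nr_parts;

    disk->nr_parts = 0;
    return dropped;
}

/*
 * Model of the proposed behaviour: stale partitions are always dropped,
 * and the partition table is only re-read when scanning is allowed.
 * Returns true if a scan was performed.
 */
static bool model_rescan_partitions(struct model_disk *disk)
{
    assert(disk != NULL);

    model_drop_partitions(disk);

    if (disk->flags & MODEL_FL_NO_PART_SCAN)
        return false;  /* partitions dropped, no rescan */

    disk->nr_parts = disk->parts_on_media;
    return true;
}
```

With the flag set, a call leaves the model disk with zero partitions and
reports that no scan happened; with the flag clear, the partitions are
dropped and then repopulated from what is on the media.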

--
Jens Axboe
