Re: Freezable workqueue blocks non-freezable workqueue during the system resume process

From: Jan Kara
Date: Thu Mar 03 2016 - 04:32:57 EST


Hello,

On Wed 02-03-16 11:00:58, Tejun Heo wrote:
> On Fri, Feb 26, 2016 at 02:19:20PM +0800, Peter Chen wrote:
> > On Thu, Feb 25, 2016 at 05:01:12PM -0500, Tejun Heo wrote:
> > > Hello, Peter.
> > >
> > > On Wed, Feb 24, 2016 at 03:24:30PM +0800, Peter Chen wrote:
> > > > > You might want to complain to the block-layer people about this. I
> > > > > don't know if anything can be done to fix it.
> > > > >
> > > > > Or maybe flush_work and flush_delayed_work can be changed to avoid
> > > > > blocking if the workqueue is frozen. Tejun?
> > > > >
> > > >
> > > > I have a patch to show the root cause of this issue.
> > > >
> > > > http://www.spinics.net/lists/linux-usb/msg136815.html
> > >
> > > I don't get it. Why would it deadlock? Shouldn't things get rolling
> > > once the workqueues are thawed?
> >
> > The workqueue writeback can't be thawed due to driver's resume
> > (dpm_complete) is lock nested, and can't be finished.
>
> Ugh... that's nasty. I wonder whether the right thing to do is making
> writeback workers non-freezable. IOs are supposed to be blocked from
> lower layer anyway. Jan, what do you think?

Well no, at least currently IO is not blocked in lower layers AFAIK - for
that you'd need to freeze block devices & filesystems, and there are issues
with that (Jiri Kosina was the last one who was trying to make this work,
AFAIR). And I think you need to stop writeback (and generally any IO) from
being generated so that it doesn't interact in a strange way with device
drivers being frozen. So IMO, until suspend freezes filesystems & devices
properly, you have to freeze the writeback workqueue.

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR