Re: System hangs if NVMe/SSD is removed during suspend

From: Jens Axboe
Date: Fri Oct 04 2019 - 09:32:45 EST


On 10/4/19 5:01 AM, Mika Westerberg wrote:
> On Fri, Oct 04, 2019 at 11:59:26AM +0200, Rafael J. Wysocki wrote:
>> On Friday, October 4, 2019 10:03:40 AM CEST Mika Westerberg wrote:
>>> On Thu, Oct 03, 2019 at 09:50:33AM -0700, Tejun Heo wrote:
>>>> Hello, Mika.
>>>>
>>>> On Wed, Oct 02, 2019 at 03:21:36PM +0300, Mika Westerberg wrote:
>>>>> but from that discussion I don't see a more generic solution being
>>>>> implemented.
>>>>>
>>>>> Any ideas how we should fix this properly?
>>>>
>>>> Yeah, the only fix I can think of is not using a freezable wq. It's
>>>> just not a good idea and not all that difficult to avoid using.
>>>
>>> OK, thanks.
>>>
>>> In that case I will just make a patch that removes WQ_FREEZABLE from
>>> bdi_wq and see what people think about it :)
>>
>> I guess that depends on why WQ_FREEZABLE was added to it in the first place. :-)
>>
>> The reason might be to avoid writes to persistent storage after creating an
>> image during hibernation, since wqs remain frozen throughout the entire
>> hibernation including the image saving phase.
>
> Good point.
>
>> Arguably, making the wq freezable is kind of a sledgehammer approach to that
>> particular issue, but in principle it may prevent data corruption from
>> occurring, so be careful there.
>
> I tried to find the commit that introduced the "freezing" and I think it
> is this one:
>
> 03ba3782e8dc writeback: switch to per-bdi threads for flushing data
>
> Unfortunately, from that commit it is not clear (at least to me) why it
> calls set_freezable() for the bdi task. It does not look like it has
> anything to do with blocking writes to storage while entering
> hibernation but I may be mistaken.

Wow, a decade ago...

Honestly, I don't recall why these were marked freezable, and as I wrote
in the other reply, I don't think there's a good reason for that to be
the case.

--
Jens Axboe