Re: [PATCH v2] mm: Enable suspend-only swap spaces

From: David Hildenbrand
Date: Wed Jul 14 2021 - 03:51:23 EST

On 14.07.21 07:43, Michal Hocko wrote:
On Mon 12-07-21 09:41:26, David Hildenbrand wrote:
On 12.07.21 09:03, Michal Hocko wrote:
[Cc linux-api]

On Fri 09-07-21 10:50:48, Evan Green wrote:
Currently it's not possible to enable hibernation without also enabling
generic swap for a given swap area. These two use cases are not the
same. For example there may be users who want to enable hibernation,
but whose drives don't have the write endurance for generic swap

Add a new SWAP_FLAG_NOSWAP that adds a swap region but refuses to allow
generic swapping to it. This region can still be wired up for use in
suspend-to-disk activities, but will never have regular pages swapped to

Could you expand some more on why a strict exclusion is really
necessary? I do understand that one might not want to have swap storage
available all the time, but considering that swapon is really a light
operation, something like the following should be a reasonable
workaround, no?
swapon storage/file
swapoff storage
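
For illustration only, the swapon/swapoff sequence above could look roughly like this from a user-space program. This is a minimal sketch under assumptions: it uses swapon(2) and swapoff(2) from <sys/swap.h>, it assumes hibernation is triggered between the two calls by writing "disk" to a power-state file (normally /sys/power/state), and the helper names write_power_state() and hibernate_with_temporary_swap() are hypothetical. It needs root to do anything useful.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/swap.h>

/* Hypothetical helper: write a sleep state such as "disk" to the given
 * power-state file. Returns 0 on success or a negative errno value. */
static int write_power_state(const char *state_path, const char *value)
{
    FILE *f = fopen(state_path, "w");
    int rc = 0;

    if (!f)
        return -errno;
    if (fputs(value, f) == EOF)
        rc = -EIO;
    /* The buffered write is flushed here, so this is where a real
     * hibernation request would be issued and where it can fail. */
    if (fclose(f) == EOF && rc == 0)
        rc = -errno;
    return rc;
}

/* Hypothetical sketch of the workaround: enable the swap area only for
 * the duration of one hibernation cycle, then disable it again. */
int hibernate_with_temporary_swap(const char *swap_path,
                                  const char *state_path)
{
    int rc;

    /* From here until swapoff() the area is ordinary swap, so regular
     * pages may still be swapped to it in the meantime. */
    if (swapon(swap_path, 0) != 0)
        return -errno;

    rc = write_power_state(state_path, "disk");

    /* Reached after resume, or right away if hibernation failed. */
    if (swapoff(swap_path) != 0 && rc == 0)
        rc = -errno;
    return rc;
}
```

A caller would pass the swap device or file and the power-state path, for example hibernate_with_temporary_swap("storage/file", "/sys/power/state"). The sketch only covers hibernation requested from user space.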

I'm certainly not a hibernation expert, but I'd guess this can also be
triggered by HW events, so from the kernel and not only from user space
where your workaround would apply.

Is there anything preventing such a HW event doing the equivalent of the

Let's take a look at hibernate() callers:

drivers/mfd/tps65010.c: calls hibernate() from IRQ context, based on HW
kernel/power/autosleep.c: calls hibernate() when it thinks it might be
a good time to go to sleep
kernel/power/main.c: calls hibernate() triggered by userspace
kernel/reboot.c: calls hibernate() triggered by userspace

So on two paths, hibernate() is not under user space control and the sequence you propose does not apply. The kernel itself has no idea which swap space to activate before hibernating; that's a user space decision. And without this patch, user space cannot communicate that decision to the kernel without eventually also swapping.

However, I assume in most cases (e.g., ACPI events, power button pressed, ...) we always notify user space, which in turn decides which action to take. Doing it under kernel control is more of a corner case I guess, so I do wonder if we really care about these setups.

Anyhow, the proposal here does not sound completely crazy to me, although it's unfortunate how we decided to mangle hibernation and swapping into the same mechanism originally; a different interface to activate "hibernation only backends" would be cleaner than doing a "swapon ..." without swapping. However, that will require much more work and I am not sure if it's worth it ...


David / dhildenb