Re: [PATCH v2] mm: Enable suspend-only swap spaces
From: Evan Green
Date: Mon Jul 12 2021 - 17:32:46 EST
On Mon, Jul 12, 2021 at 12:03 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> [Cc linux-api]
>
> On Fri 09-07-21 10:50:48, Evan Green wrote:
> > Currently it's not possible to enable hibernation without also enabling
> > generic swap for a given swap area. These two use cases are not the
> > same. For example there may be users who want to enable hibernation,
> > but whose drives don't have the write endurance for generic swap
> > activities.
> >
> > Add a new SWAP_FLAG_NOSWAP that adds a swap region but refuses to allow
> > generic swapping to it. This region can still be wired up for use in
> > suspend-to-disk activities, but will never have regular pages swapped to
> > it.
>
> Could you expand some more on why a strict exclusion is really
> necessary? I do understand that one might not want to have swap storage
> available all the time but considering that swapon is really a light
> operation so something like the following should be a reasonable
> workaround, no?
> swapon storage/file
> s2disk
> swapoff storage

Broadly, it seemed like a reasonable thing for the kernel to be able
to do. The workaround you suggest does work for some use cases, but it
seems like a gap the kernel could more naturally fill.

Without getting too far into the weeds, there are a handful of factors
that make this change particularly useful to me:
* Slicing off part of your SSD to be SLC (single level cell) is
expensive. From what I understand you gain endurance and speed at the
cost of 3-4x capacity. In other words for every 1GB of SLC space you
need for swap, it costs you 3-4GB of storage space out of the primary
namespace. So I'm incentivized to size this region as small as
possible. Hibernate's speed/endurance requirements are not quite as
harsh as those of regular swap. Steering the two separately lets me
put the hibernate image in regular storage, so I'm not forced to
oversize expensive/fast swap space.
* Even with the workaround, swap can end up in the hibernate region.
Hibernate starts by allocating its giant 50%-of-memory region, which
is often the forcing function for pushing things into swap. With the
workaround, even if my hibernate region has the lowest priority, there's
still a reasonable chance I'll end up swapping into it. If I have
different security designs for swap space and hibernate, then even a
chance of some swap leaking into this region is a problem.
* I also want to limit the online attack surface that swap presents.
I can make headway here by disallowing open() calls on active swap
regions (via an LSM), and permanently disabling swapon/swapoff system
calls after early init. The workaround isn't great for me because I
want to set everything up at early init time and then not touch it. By
suspend time, on my system I no longer have the ability to make
swapon/swapoff calls.
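
To put rough numbers on the first point, here is a minimal sketch of the
capacity arithmetic. The function names are made up for illustration, and
the cost factor is just the 3-4x estimate above, not a measured value. It
compares one shared SLC region (swap plus hibernate image) against SLC for
swap only, with the hibernate image in regular storage at 1:1 cost.

```c
#include <stdint.h>

/*
 * Primary-namespace capacity (GB) given up to carve out slc_gb of SLC.
 * cost_factor is the rough 3-4x estimate; it is an assumption.
 */
uint64_t slc_cost_gb(uint64_t slc_gb, uint64_t cost_factor)
{
	return slc_gb * cost_factor;
}

/* Shared region: SLC has to hold both swap and the hibernate image. */
uint64_t shared_cost_gb(uint64_t swap_gb, uint64_t hib_gb,
			uint64_t cost_factor)
{
	return slc_cost_gb(swap_gb + hib_gb, cost_factor);
}

/* Split: only swap is SLC; the hibernate image costs its own size. */
uint64_t split_cost_gb(uint64_t swap_gb, uint64_t hib_gb,
		       uint64_t cost_factor)
{
	return slc_cost_gb(swap_gb, cost_factor) + hib_gb;
}
```

With hypothetical sizes of 4GB of swap and an 8GB hibernate image at a 3x
factor, the shared layout consumes 36GB of primary capacity and the split
layout consumes 20GB.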
-Evan