Re: Reasons to merge suspend2.
From: Al Boldi
Date: Wed Apr 25 2007 - 08:48:12 EST
I didn't read your whole post, it's way too long, but I would like to see
your patch in mainline as an option to swsusp. What would make this
infeasible?
Thanks!
--
Al
---
Nigel Cunningham wrote:
> Hi all.
>
> I've been working on this email on and off for a while, but since Pavel
> raised the issue again, I thought I should make a concerted effort to
> finish it...
>
> In this email, I'm going to outline the problems with the current design
> (uswsusp and swsusp) and the ways in which Suspend2 overcomes those
> limitations, before going on to outline the additional advantages
> Suspend2 has for users and address objections previously raised against
> merging Suspend2.
>
> A) Problems with the current design.
> ====================================
> 1) Ordering of operations.
>
> The current [u]swsusp design doesn't do things in discrete, well ordered
> stages. Storage for the image is not allocated until after the atomic
> copy has been done. This means that the process can fail when we are a
> significant portion of the way into suspending, and it means it can fail
> when the user will seriously expect it to run to completion. The
> solution to this issue is simple: separate preparing to suspend from
> actually writing the image. In the preparation step, ensure, so far as
> you are able, that there will be sufficient memory and sufficient
> storage to complete the process, and don't write anything or do any
> atomic copying until after that has been done.
>
> The only valid objection I can think of is that you can't know for
> certain prior to doing the atomic copy how much memory & storage will be
> needed for allocations by driver suspend methods. That can be addressed
> by a simple extension of the driver model, wherein drivers could report
> how many pages they will need. (If slab will be needed, the worst case
> can be assumed). Rafael's notify patches (recently posted) also help in
> that area.
>
> Once processes are frozen, all significant memory usage can be accounted
> for, because the process doing the suspending will be the only one
> allocating memory.
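
A minimal sketch of the "prepare first, write later" ordering described above, in Python for illustration only. The field names and the idea of a per-driver page report are assumptions taken from the proposal in the text, not an existing kernel interface.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PrepareEstimate:
    image_pages: int         # pages that will be written to the image
    driver_pages: int        # pages drivers report needing (hypothetical report)
    free_memory_pages: int   # pages free once processes are frozen
    free_storage_pages: int  # pages of storage available for the image


def prepare_to_suspend(est: PrepareEstimate) -> List[str]:
    """Return the list of problems found; empty means it is safe to go on.

    Nothing is written and no atomic copy is made here: the point is to
    fail early, before the user expects the cycle to run to completion.
    """
    problems = []
    if est.driver_pages > est.free_memory_pages:
        problems.append("not enough free memory for driver allocations")
    if est.image_pages > est.free_storage_pages:
        problems.append("not enough storage for the image")
    return problems
```

Only when the returned list is empty would the atomic copy and the image write begin.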
>
> 2) Limit on image size.
>
> The current implementation limits the size of an image to an absolute
> maximum of half the amount of ram. This is certainly an improvement over
> the old days where it sought to free everything it could, but it's still
> not good enough. Current memory freeing code doesn't free the exact
> amount requested; it often frees far more. This not only results in a
> smaller image; it also makes the system proportionately less responsive
> after resume, whenever those pages are needed again.
> A full image is certainly not needed by
> everyone. Those with huge amounts of memory, very fast storage devices
> or particular memory usage patterns may, quite rightly, not want to
> store the whole lot in an image. This doesn't mean, however, that those
> who want or need (from their perspective) a full image of memory
> shouldn't be able to have it. It just adds to the argument for making it
> tunable (which swsusp has done too).
>
> 3) Lack of provision for tuning to individual needs.
>
> Swsusp historically included very little provision whatsoever for the
> user to tune their configuration. This has recently begun to change, and
> I applaud that. But it needs to go further. Suspending to disk is not a
> one-size-fits-all situation. People have different hardware
> configurations, with the result being that some people benefit from
> compression while others do better without it. Some people want
> encryption in a particular configuration while others don't care about
> encryption at all. Some people want to limit the image size, others
> don't. Sometimes a user might want to reboot instead of powering down
> (dual booting). All of this should be doable, without having to hack the
> code or recompile the kernel, and should be as simple as possible.
> Suspend2, via its /sys/power/suspend2 interface and hibernate-script
> porcelain, makes this easy.
>
> 4) No support for multiple swap devices / non swap storage.
>
> Until recently, [u]swsusp supported a single swap partition only.
> Support for a swap file has been added, but [u]swsusp still supports
> only one swap device at a time. For most people, this is adequate, but
> this doesn't mean everyone should be forced to fit this mould.
>
> [u]swsusp also lacks support for storage to non-swap. Particularly in
> systems that rely on swap for normal activity, this can make [u]swsusp
> less reliable. The amount of swap available varies according to
> workload, so sometimes the user will be unable to suspend. To address
> this raciness/competition against other swap usage, Suspend2 supports
> writing to a generic file, either a partition or a file on an ordinary
> partition.
>
> B) Further advantages of Suspend2.
> ==================================
> 1) Improvements over swsusp.
> ----------------------------
> a) Modular design.
>
> Parts of Suspend2 implement support for storing an image in swap or in a
> file, using cryptoapi for compression and/or encryption and talking to a
> userspace user interface via a netlink socket. Suspend2 works just fine
> without CONFIG_SWAP, CONFIG_NET and/or CONFIG_CRYPTOAPI, however,
> because it uses a modular design wherein support for these subsystems is
> abstracted (not to be confused with kernel modules). If you disable swap
> support, for example, one file is simply not built. The number of
> #ifdefs in Suspend2 is thus minimal.
>
> In addition, the modular design made modifications such as switching
> from internal compression and encryption support to cryptoapi simple and
> painless. All of the required modifications were found in compression.c,
> encryption.c and Kconfig in kernel/power. The old and new
> implementations could even co-exist if so desired. I recently dropped
> encryption support (after deciding the existing support in block dev
> drivers was more than adequate). This took five minutes tops - remove
> the .c and modify the Makefile and Kconfig.
>
> The modular design also helps with implementing the user interface. Each
> module gets its own subdirectory in /sys/power/suspend2, so the top
> level directory is not cluttered and it's easier to find what you're
> after. Switching from /proc/suspend2 to /sys/power/suspend2 required
> modifications to just two main routines (one for reading and one for
> writing entries).
>
> b) Compression support.
>
> Swsusp has no support for compressing an image. Suspend2 has optional
> cryptoapi based support for compressing the image, and includes a patch
> to add an LZF based compressor to cryptoapi. When this support is used,
> the speed of reading (and to a lesser extent writing) the image is
> generally about doubled.
>
> c) Optional image size limit.
>
> Suspend2 also implements an optional, user specified soft limit on the
> image size. If set to a positive value, it is interpreted as a number of
> megabytes and Suspend2 attempts to free memory to keep the image size
> within this limit, but won't abort the cycle if this limit isn't met. If
> set to -1, Suspend2 will refuse to free any memory, and will abort if
> other criteria for suspending aren't satisfied. If set to -2, it will
> drop filesystem caches (equivalent to echo 1 > /proc/sys/vm/drop_caches)
> prior to suspending, but will not otherwise eat memory unless necessary.
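
The three cases above can be modelled as follows. This is a sketch of the semantics as described in the text, not Suspend2 code; the function name and the returned dictionary are hypothetical, and the behaviour for 0 is not stated in the text, so the sketch rejects it.

```python
def plan_for_image_size_limit(limit: int, image_mb: int) -> dict:
    """Interpret the image size limit setting for an image of image_mb MB.

    Returns whether caches are dropped first and how many MB Suspend2
    would try to free. The positive limit is soft: missing the target
    does not abort the cycle.
    """
    if limit == -1:
        # Refuse to free any memory at all.
        return {"drop_caches": False, "mb_to_free": 0}
    if limit == -2:
        # Drop filesystem caches, but free nothing else up front.
        return {"drop_caches": True, "mb_to_free": 0}
    if limit > 0:
        # Soft limit in megabytes: aim to shrink the image down to it.
        return {"drop_caches": False, "mb_to_free": max(0, image_mb - limit)}
    raise ValueError("limit value not described in the text: %d" % limit)
```

For example, a limit of 500 with an 800MB image gives a freeing target of 300MB, while the same limit with a 300MB image frees nothing.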
>
> d) Cryptoapi based compression.
>
> Suspend2 uses cryptoapi for compression. Swsusp includes no built in
> support for compression.
>
> 2) Improvements over uswsusp.
> -----------------------------
> a) Simpler to set up.
>
> The heart of Suspend2 is implemented in the kernel so, unlike uswsusp,
> there is no need for the user to download and install userspace
> libraries, build a userspace app and figure out how to create and update
> an initrd or initramfs. In most situations, it just works. (The
> exception is LVM and such like, where both implementations require
> userspace apps to set up access to the logical volumes (or encrypted
> volumes) before they can be used for resuming).
>
> b) No unnecessary copying of data.
>
> uswsusp copies the image to userspace and back again. It may compress
> the data in userspace. But none of this is necessary. There is a
> perfectly good compression and encryption library in the form of
> cryptoapi already in the kernel. Suspend2 uses this. uswsusp could too.
>
> c) API changes far less critical.
>
> Modifications to the API between kernel and userspace can cause big
> headaches for uswsusp (see, eg, the recent issue with running a 32 bit
> suspend program on a 64 bit kernel, recently raised by Johannes Berg on
> the linux-pm mailing list).
>
> In Suspend2's case, userspace programs only handle the user interface.
> If an API mismatch does occur, the issue will not void the user's
> ability to suspend or resume.
>
> 3) Completely New Functionality/Improvements.
> ---------------------------------------------
> a) Filewriter.
>
> Using swap to store the image is inherently racy. To be able to suspend,
> we need enough free memory and enough free storage. But getting enough
> free memory might involve swapping out some memory, which reduces the
> amount of available storage, which might require more free memory.
>
> It is true that most of the time this race isn't an issue. Nevertheless,
> that's the nature of races.
>
> Suspend2 implements support for files as a means of avoiding this issue.
> Thus, it is much more reliable in low memory situations than swsusp or
> uswsusp.
>
> b) Multiple swap devices.
>
> Suspend2 supports writing an image to multiple swap devices, whereas
> uswsusp and swsusp only write to one device.
>
> c) Full image of memory.
>
> Suspend2 implements support for writing a full image of memory. You thus
> get a more responsive system post-resume; just as responsive as if you'd
> never suspended. This support can be disabled via a sysfs entry
> (no_pageset2).
>
> d) Keep image mode.
>
> Suspend2 supports keeping the image after resuming. This is used in
> kiosk systems where nothing is written to the filesystem or changes are
> written to a separate filesystem that is mounted after resume and
> unmounted before suspending or powering off.
>
> e) Ability to cancel a cycle.
>
> Suspend2 allows the user to cancel a cycle (and this ability can be
> disabled). This means you don't have to wait for the system to finish
> suspending, then resume it to get your system back. If done prior to the
> atomic copy, you have it back instantly. If afterwards, a small portion
> of the image is read first.
>
> f) Scripting support.
>
> Suspend2 allows scripts to check whether an image exists
> (cat /sys/power/suspend2/have_image), remove one (echo 0 > have_image),
> and set the location of the image header (echo /dev/hda1 > resume2). One
> user utilises this support to provide an initrd/ramfs based menu of
> previously suspended live-cd images. This could also be used in a lab
> environment with homogeneous computer specifications to allow resuming
> to a login screen, then resuming the image of a user's previous session
> once they have logged in.
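
A sketch of how a script might wrap the three operations mentioned above. The file names have_image and resume2 come from the text; the helper names are hypothetical, and it is an assumption that reading "1" from have_image means an image exists. The base directory is a parameter so the helpers can be tried against a scratch directory.

```python
from pathlib import Path

# Location given in the text; pass another directory to experiment safely.
SUSPEND2_DIR = Path("/sys/power/suspend2")


def have_image(base: Path = SUSPEND2_DIR) -> bool:
    """Equivalent of `cat have_image` (assumes "1" means an image exists)."""
    return (base / "have_image").read_text().strip() == "1"


def remove_image(base: Path = SUSPEND2_DIR) -> None:
    """Equivalent of `echo 0 > have_image`."""
    (base / "have_image").write_text("0\n")


def set_resume_device(device: str, base: Path = SUSPEND2_DIR) -> None:
    """Equivalent of `echo /dev/hda1 > resume2`."""
    (base / "resume2").write_text(device + "\n")
```

An initrd menu like the one described could call set_resume_device() for each candidate image and then have_image() to decide which entries to show.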
>
> g) Userspace user interface.
>
> Suspend2 provides userspace based user interface programs that
> communicate with the core code via a netlink socket. This allows the
> user to have all the eyecandy they want (although it might slow
> suspending!), without the code needing to run in kernelspace or
> compromise the integrity of the image.
>
> h) Early messages.
>
> Suspend2 provides user-friendly handling of error conditions early in
> the boot process. Sanity checks on the image are done before loading it,
> and if it looks like the user has (for example) accidentally booted the
> wrong kernel, Suspend2 will warn them and allow them to reboot into the
> right kernel, or invalidate the image and carry on booting. This has a
> 25 second timeout and sensible default, so the kernel will not hang
> forever.
>
> i) Powerdown methods.
>
> Suspend2 supports a greater variety of methods of powering down once the
> image has been written. It can enter ACPI states S3, S4 or S5, use a
> non-ACPI power off or resume an alternate image.
>
> S3 was recently picked up by uswsusp, but isn't supported by swsusp. It
> allows the user to suspend to ram instead of powering down after writing
> the image. If the battery runs out, we resume as if they'd fully powered
> off. If it doesn't, we act like the cycle was cancelled at the last
> moment, reloading a small portion of the image (pages that were
> overwritten by the atomic copy) before giving control back to the user.
>
> The support for resuming an alternate image is primarily useful for a
> lab/multi-distro environment. It has the same limitations regarding
> mounted filesystems that normally apply, but otherwise provides a way to
> switch between images quickly and easily. (One image could be a log-in
> screen/image selection menu, and the others individual users' or
> distros' sessions).
>
> j) Transparent swsusp replacement.
>
> Suspend2 also implements optional replacement of swsusp. When enabled,
> echo disk > /sys/power/state will activate Suspend2, resume= will
> override resume2= and noresume will also function as noresume2. Finally,
> activating a swsusp resume will also cause Suspend2 to check whether to
> resume (we don't know until we check whether the replacing of swsusp was
> enabled when we suspended or not). A compile time option allows the user
> to enable or disable this functionality by default.
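
The command line rules above (resume= overriding resume2=, noresume acting as noresume2) can be sketched like this. It is a model of the described behaviour only; the function name and return shape are hypothetical, and real kernel parameter parsing is more involved.

```python
def effective_resume_options(cmdline: str, replace_swsusp: bool) -> dict:
    """Work out which resume settings Suspend2 would act on."""
    params = cmdline.split()
    resume = None
    resume2 = None
    for p in params:
        if p.startswith("resume2="):
            resume2 = p[len("resume2="):]
        elif p.startswith("resume="):
            resume = p[len("resume="):]
    noresume2 = "noresume2" in params
    if replace_swsusp:
        # With replacement enabled, the swsusp options are honoured too.
        if resume is not None:
            resume2 = resume
        noresume2 = noresume2 or "noresume" in params
    return {"resume2": resume2, "noresume2": noresume2}
```

With replacement disabled, resume= and noresume are left for swsusp and do not affect the result.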
>
> k) Expected compression ratio.
>
> Suspend2 allows the user to set an expected compression ratio. This
> allows the user to store a larger image than might otherwise be
> possible, particularly in situations where available storage is less
> than the amount of memory in use. Let's imagine, for example, that the
> user has 1GB of RAM and a 600MB swap partition or file. Without an
> expected compression ratio, Suspend2 would always store at most 600MB in
> the image. With an expected compression ratio of 50% (common for LZF),
> Suspend2 will not free memory even if there's the full gigabyte of
> memory in use, because it will assume that the compressed image will fit
> in 500MB.
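
The arithmetic in this example can be checked with a small sketch. The function names are hypothetical; the ratio is taken to mean compressed size as a percentage of the original, which is how the 50% figure is used above, and 1GB is rounded to 1000MB to match the 500MB in the text.

```python
def expected_image_mb(memory_in_use_mb: int, ratio_percent: int = 100) -> int:
    """Storage the image is expected to need after compression."""
    return memory_in_use_mb * ratio_percent // 100


def must_free_mb(memory_in_use_mb: int, storage_mb: int,
                 ratio_percent: int = 100) -> int:
    """Memory that has to be freed for the image to fit in storage_mb."""
    # Largest uncompressed image expected to fit in the available storage.
    max_image_mb = storage_mb * 100 // ratio_percent
    return max(0, memory_in_use_mb - max_image_mb)


# 1GB in use, 600MB of swap, no expected ratio: 400MB must be freed.
print(must_free_mb(1000, 600))
# Same system with a 50% expected ratio: nothing needs freeing.
print(must_free_mb(1000, 600, 50))
```

If the real ratio turns out worse than expected, the image may not fit after all, so the setting trades a larger image against that risk.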
>
> l) Simpler swap file support.
>
> Suspend2 makes using a swap file much simpler. The user simply needs to
> swapon the file, then cat /sys/power/suspend2/swap/headerlocations:
>
> # cat /sys/power/suspend2/swap/headerlocations
> For swap partitions, simply use the format: resume2=swap:/dev/hda1.
> For swapfile `/blot/swapfile`, use resume2=swap:/dev/hda6:0xf4000.
> #
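
For scripts that build a boot entry from that output, the two resume2= forms shown above can be generated and parsed as sketched here. The helper names are hypothetical and the sketch covers only the swap: forms that appear in the example output.

```python
from typing import Optional, Tuple


def resume2_parameter(device: str, header_offset: Optional[int] = None) -> str:
    """Build resume2=swap:<device>[:<offset>] as in the output above."""
    if header_offset is None:
        return "resume2=swap:%s" % device           # swap partition
    return "resume2=swap:%s:%s" % (device, hex(header_offset))  # swap file


def parse_resume2(value: str) -> Tuple[str, Optional[int]]:
    """Split the part after 'resume2=' into (device, header offset)."""
    parts = value.split(":")
    if parts[0] != "swap" or len(parts) not in (2, 3):
        raise ValueError("unsupported resume2 value: %r" % value)
    offset = int(parts[2], 16) if len(parts) == 3 else None
    return parts[1], offset
```

The offset is only present for a swap file, where it locates the image header within the partition holding the file.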
>
> m) Multithreaded i/o.
>
> With the recent move to doing cpu hotplugging just prior to the atomic
> copy, rather than right at the start of the cycle, the possibility has
> been opened up of using multiple cores to do the image de/compression.
> Suspend2 now includes this. The performance improvement has been
> particularly seen during compression, where the speed on a dual core P4
> came up to the same as seen in reading the image (ie approximately
> double that achieved without compression). This support is disabled by
> default at the moment, while upstream issues with the interaction
> between cpu hotplugging and freezing are resolved.
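
As a userspace illustration of the idea of spreading page compression over several cores, here is a sketch using a thread pool. It is not how Suspend2 does it in the kernel: zlib stands in for LZF, and the page size and worker count are assumptions.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List

PAGE_SIZE = 4096  # assumed page size in bytes


def compress_pages(pages: List[bytes], workers: int = 2) -> List[bytes]:
    """Compress each page independently, several at a time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so page order is preserved.
        return list(pool.map(zlib.compress, pages))


def decompress_pages(blobs: List[bytes], workers: int = 2) -> List[bytes]:
    """Reverse of compress_pages(), also done in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.decompress, blobs))
```

Because each page is compressed on its own, the work divides cleanly between workers and no page depends on another when reading the image back.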
>
> 4) Support.
> -----------
> Suspend2 has very active support in mailing lists, a web site, bugzilla
> and wiki. Nigel is not going to refuse to deal with people because their
> kernel is tainted or isn't the latest release.
>
> C) Objections to merging Suspend2.
> ==================================
> 1) Size of the patch.
>
> These objections seem to have been dealt with in this morning's
> discussions already. The only thing I would add is that the Suspend2
> patch size is somewhat inflated by documentation. The 16000 lines quoted
> includes 1100 lines of Changelog and another 1100 of documents
> describing how it works and how to use it.
>
> 2) "It should be done in parts"
>
> Since we have a modular design, some parts, such as compression and
> support for writing to ordinary files can clearly be handled separately.
>
> A comparison of the core code with that in swsusp would, however, show
> that Suspend2 is far more than just a bolting on of additional features
> to swsusp. Substantial changes in the basic method of operation have
> been made (see esp. A1 above) which would make the task far larger and more
> complicated than it needs to be.
>
> While swsusp could, therefore, be mutated into suspend2 over time, I
> believe it is far more straightforward and simple to just merge
> suspend2, let the two coexist for a while and then drop swsusp when
> people are satisfied that suspend2 is an adequate replacement.
>
> A tangential (but important) issue is that I simply don't have the time
> to do the incremental modifications to swsusp.
>
> 3) It's not needed.
>
> It is true that swsusp is perfectly adequate for some people. This
> doesn't, however, mean that it meets the needs of all people.
>
> To put it bluntly, if Suspend2 wasn't needed, I wouldn't be working on
> it. I have more than enough in the way of other things that I'd rather
> be doing, but as a user, I want more than swsusp or uswsusp deliver, so
> I continue to work on Suspend2.
>
> 4) [u]swsusp will/could implement it in the future.
>
> At the last review, Pavel replied to many of the points about Suspend2
> features that swsusp lacks by saying 'uswsusp can do this'. But the
> facts are that uswsusp is very slow to get these new features - the
> previous revision of this paragraph had (and I believe it was accurate)
> "has no new features over swsusp at the moment". Furthermore, it would
> probably not be unreasonable to argue that if Suspend2 didn't have these
> features, uswsusp would never have gotten them.
>
> Hope this helps,
>
> Nigel