Re: Back to the future.
From: Linus Torvalds
Date: Thu Apr 26 2007 - 12:57:35 EST
On Thu, 26 Apr 2007, Nigel Cunningham wrote:
>
> * Doing things in the right order? (Prepare the image, then do the
> atomic copy, then save).
I'd actually like to discuss this a bit..
I'm obviously not a huge fan of the whole user/kernel level split and
interfaces, but I actually do think that there is *one* split that makes
sense:
 - generate the (whole) snapshot image entirely inside the kernel
 - do nothing else (i.e. no IO at all), and just export it as a single image
   to user space (literally just mapping the pages into user space).
*one* interface. None of the "pretty UI update" crap. Just a single
system call:
	void *snapshot_system(u32 *size);
which will map in the snapshot, return the mapped address and the size
(and if you want to support snapshots > 4GB, be my guest, but I suspect
you're actually *better* off just admitting that if you cannot shrink
the snapshot until its size fits in 32 bits, it's not worth doing).
User space gets a fully running system, with that one process having that
one image mapped into its address space. It can then compress/write/do
whatever to that snapshot.
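As a rough illustration of the "write" part, here is a minimal user-space sketch that pushes a mapped image out to a file descriptor. The name write_image() is hypothetical, and `image` merely stands in for whatever pointer the proposed snapshot_system() would return; that call does not exist, so nothing below depends on it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Write all `size` bytes of a mapped image to `fd`, retrying on short
 * writes and on EINTR. Returns 0 on success, -1 on error (errno is set
 * by write()). write_image() is a hypothetical helper, not a kernel API.
 */
int write_image(int fd, const void *image, uint32_t size)
{
	const unsigned char *p = image;
	size_t left = size;

	assert(image != NULL || size == 0);

	while (left > 0) {
		ssize_t n = write(fd, p, left);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return -1;
		}
		p += n;			/* advance past what was written */
		left -= (size_t)n;
	}
	return 0;
}
```

Compression, encryption or a network send would slot in at the same place, since the image is just ordinary memory in that one process.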
You need one other system call, of course, which is
	int resume_snapshot(void *snapshot, u32 size);
and for testing, you should be able to basically do
	u32 size;
	void *buffer = snapshot_system(&size);
	if (buffer != MAP_FAILED)
		resume_snapshot(buffer, size);
and it should obviously work.
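Neither system call exists, so the round trip can only be tried against stand-ins. The sketch below assumes two hypothetical user-space mocks, mock_snapshot_system() and mock_resume_snapshot(), which map a fake one-page "image" with an invented "SNAP" tag and then validate and unmap it. It only shows the calling convention (including the MAP_FAILED check), not how a kernel would implement either call.

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#define MOCK_MAGIC "SNAP"	/* hypothetical image tag, invented here */
#define MOCK_SIZE  4096u	/* hypothetical image size, invented here */

/* Stand-in for the proposed snapshot_system(): maps a fake image. */
void *mock_snapshot_system(uint32_t *size)
{
	void *p;

	assert(size != NULL);
	p = mmap(NULL, MOCK_SIZE, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return MAP_FAILED;
	memcpy(p, MOCK_MAGIC, 4);
	*size = MOCK_SIZE;
	return p;
}

/* Stand-in for the proposed resume_snapshot(): validates, then unmaps. */
int mock_resume_snapshot(void *snapshot, uint32_t size)
{
	if (snapshot == NULL || snapshot == MAP_FAILED || size < 4 ||
	    memcmp(snapshot, MOCK_MAGIC, 4) != 0) {
		errno = EINVAL;
		return -1;
	}
	return munmap(snapshot, size);
}

/* The round trip from the text, with the mock calls swapped in. */
int round_trip(void)
{
	uint32_t size;
	void *buffer = mock_snapshot_system(&size);

	if (buffer == MAP_FAILED)
		return -1;
	return mock_resume_snapshot(buffer, size);
}
```

A real resume_snapshot() would of course not return to its caller on success; the mock returns 0 only so the sequence can be exercised.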
And btw, the device model changes are a big part of this. Because I don't
think it's even remotely debuggable with the full suspend/resume of the
devices being part of generating the image! That freeze/snapshot/unfreeze
sequence is likely a lot more debuggable, if only because freeze/unfreeze
is actually a no-op for most devices, and snapshotting is trivial too.
Once you have that snapshot image in user space you can do anything you
want. And again: you'd have a fully working system: not any degradation
*at*all*. If you're in X, then X will continue running etc even after the
snapshotting, although obviously the snapshotting will have tried to page
a lot of stuff out in order to make the snapshot smaller, so you'll likely
be crawling.
> * Multithreaded I/O (might as well use multiple cores to compress the
> image, now that we're hotplugging later).
> * Support for > 1 swap device.
> * Support for ordinary files.
> * Full image option.
> * Modular design?
I'd really suggest _just_ the "full image". Nothing else is probably ever
worth supporting. Your "snapshot to disk" wouldn't be _quite_ as simple as
"echo disk > /sys/power/state", but it should not necessarily be much
worse than
	snapshot_kernel | gzip -9 > /dev/snapshot
either (and resuming from the snapshot would just be the reverse)!
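For the "reverse" direction, a user-space tool would first have to get the (already decompressed) image back into memory before handing it to something like resume_snapshot(). A minimal sketch, assuming a hypothetical helper read_image() that slurps a stream such as stdin and enforces the 32-bit size limit discussed earlier:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Read from `fd` until EOF into a malloc'd buffer. On success returns the
 * buffer (caller frees) and stores its length in *size. Returns NULL on
 * I/O error, on allocation failure, or with errno = EFBIG once the image
 * reaches the 32-bit size limit. read_image() is a hypothetical helper.
 */
void *read_image(int fd, uint32_t *size)
{
	size_t cap = 1 << 16, len = 0;
	unsigned char *buf = malloc(cap);

	assert(size != NULL);
	if (!buf)
		return NULL;

	for (;;) {
		ssize_t n;

		if (len == cap) {	/* buffer full: grow, capped at 32 bits */
			size_t newcap;
			unsigned char *tmp;

			if (cap >= UINT32_MAX) {
				free(buf);
				errno = EFBIG;
				return NULL;
			}
			newcap = cap > UINT32_MAX / 2 ? UINT32_MAX : cap * 2;
			tmp = realloc(buf, newcap);
			if (!tmp) {
				free(buf);
				return NULL;
			}
			buf = tmp;
			cap = newcap;
		}

		n = read(fd, buf + len, cap - len);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			free(buf);
			return NULL;
		}
		if (n == 0)
			break;			/* EOF */
		len += (size_t)n;
	}

	*size = (uint32_t)len;
	return buf;
}
```

With that, the resume side of the pipeline would amount to something like "gunzip < image | resume_tool", where resume_tool is an assumed name for a program that calls read_image(0, &size) and then the resume call.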
And if you want to send the snapshot over a TCP connection to another
host, be my guest. With pretty images while it's transferring. Whatever.
Linus