Re: [dm-devel] [RFC] dm-bow working prototype

From: Paul Lawrence
Date: Thu Oct 25 2018 - 13:23:59 EST


Thank you for the suggestion. I spent part of yesterday experimenting with this idea, and it is certainly very promising. However, it does have some disadvantages as compared to dm-bow, if I am understanding the setup correctly:

1) Since dm-snap has no concept of the free space on the underlying file system, any write into free space will trigger a backup, and so use twice the space that dm-bow would. Changing existing data creates a backup with both drivers, but since we have to reserve the space for the backups up front with dm-snap, we would likely have only half the space for that. Either way, it seems that dm-bow is likely to double the amount of changes we could make.

(Might it be possible to dynamically resize the backup file if it is mostly used up? This would fix the problem of only having half the space for changing existing data. The documentation seems to indicate that you can increase the size of the snapshot partition, and it seems like it should be possible to grow the underlying file without triggering a lot of writes. OTOH, this would have to happen in userspace, which creates other issues.)

2) Similarly, since writes into free space do not trigger a backup in dm-bow, dm-bow is likely to have a lower performance overhead in many circumstances. On the flip side, dm-bow's backup is in free space and will collide with other writes, so this advantage will reduce as free space fills up. But by choosing a suitable algorithm for how we use free space we might be able to retain most of this advantage.
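
The space argument in 1) can be put as a toy model. This is only a sketch under simplifying assumptions: sizes are in arbitrary units, chunk granularity and metadata overhead are ignored, and the function names are made up for illustration.

```python
def dm_snap_fits(free, store, new_writes, overwrites):
    """Toy model of dm-snap with a backup store carved out of free space.

    Assumption: every write to the origin, even into free space,
    copies the old contents into the store first.
    """
    # New data must fit in whatever free space the store did not take.
    fits_in_free_space = new_writes <= free - store
    # Every written block, new or overwritten, needs a backup slot.
    fits_in_store = new_writes + overwrites <= store
    return fits_in_free_space and fits_in_store


def dm_bow_fits(free, new_writes, overwrites):
    """Toy model of dm-bow: only overwrites of existing data need a backup,
    and both backups and new data come out of the same free space."""
    return new_writes + overwrites <= free


# With 8 units free and half of it reserved as the dm-snap store:
#   dm_snap_fits(8, 4, new_writes=4, overwrites=0)  is True
#   dm_snap_fits(8, 4, new_writes=5, overwrites=0)  is False
#   dm_bow_fits(8, new_writes=8, overwrites=0)      is True
```

In this model dm-snap tops out at half the free space for new data, and at the reserved store size for changes to existing data, while dm-bow can use all of the free space for either.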

I intend to put together a fully working prototype of your suggestion next, to better compare it with dm-bow. But I do believe there is value in tracking free space and utilizing it in any such solution.


On 10/24/2018 12:24 PM, Mikulas Patocka wrote:

On Wed, 24 Oct 2018, Paul Lawrence wrote:

Android has had the concept of A/B updates since Android N, which means
that if an update is unable to boot for any reason three times, we revert to
the older system. However, if the failure occurs after the new system has
started modifying userdata, we will be attempting to start an older system
with a newer userdata, which is an unsupported state. Thus to make A/B able to
fully deliver on its promise of safe updates, we need to be able to revert
userdata in the event of a failure.

For those cases where the file system on userdata supports
snapshots/checkpoints, we should clearly use them. However, there are many
Android devices using filesystems that do not support checkpoints, so we need
a generic solution. Here we had two options. One was to use overlayfs to
manage the changes, then on merge have a script that copies the files to the
underlying fs. This was rejected on the grounds of compatibility concerns and
managing the merge through reboots, though it is definitely a plausible
strategy. The second was to work at the block layer.

At the block layer, dm-snap would have given us a ready-made solution, except
that there is no sufficiently large spare partition on Android devices. But in
general there is free space on userdata, just scattered over the device, and
of course likely to get modified as soon as userdata is written to. We also
decided that the merge phase was a high risk component of any design. Since
the normal path is that the update succeeds, we anticipate merges happening
99% of the time, and we want to guarantee their success even in the event of
unexpected failure during the merge. Thus we decided we preferred a strategy
where the device is in the committed state at all times, and rollback requires
work, to one where the device remains in the original state but the merge is
complex.

What about allocating a big file, using the FIEMAP ioctl to find the
physical locations of the file, creating a dm device with many linear
targets to map the big file and using it as a snapshot store? I think it
would be way easier than re-implementing the snapshot functionality in a
new target.
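
A minimal sketch of the mapping step, assuming the extents have already been read with FIEMAP and are given as (physical byte offset, byte length) pairs in file order, with offsets relative to the block device that holds the filesystem. The helper name and the device path are hypothetical; the output uses the dm "linear" table format of start sector, length, target name, device and offset.

```python
SECTOR_SIZE = 512  # device-mapper tables are expressed in 512-byte sectors


def linear_table(extents, device):
    """Build a dm table that stitches file extents into one linear device.

    extents: list of (physical_byte_offset, byte_length), in file order.
    device:  path of the block device holding the filesystem (assumed).
    """
    lines = []
    logical_start = 0  # running start sector inside the new dm device
    for physical, length in extents:
        if physical % SECTOR_SIZE or length % SECTOR_SIZE:
            raise ValueError("extent is not sector aligned")
        sectors = length // SECTOR_SIZE
        offset = physical // SECTOR_SIZE
        lines.append(f"{logical_start} {sectors} linear {device} {offset}")
        logical_start += sectors
    return "\n".join(lines)


# Two made-up extents: 8 KiB at 1 MiB and 4 KiB at 4 MiB.
table = linear_table([(1048576, 8192), (4194304, 4096)], "/dev/sdX")
# table is:
#   0 16 linear /dev/sdX 2048
#   16 8 linear /dev/sdX 8192
```

The resulting text could be fed to "dmsetup create" to produce the device. The file would need to be fully allocated first and must not be moved or truncated by the filesystem while the mapping is in use.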

You can mount the whole filesystem using the "origin" target and you can
attach a "snapshot" target that uses the mapped big file as its snapshot
store - all writes will be placed directly to the device and the old data
will be copied to the snapshot store in the big file.
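
As a sketch of what the two tables might look like: the kernel target behind "origin" is named "snapshot-origin", and the "snapshot" target takes the origin device, the COW device, a persistence flag (P or N) and a chunk size in sectors. The device paths, sizes and helper name below are placeholders, not values from this thread.

```python
def snapshot_tables(origin_dev, cow_dev, origin_sectors,
                    chunk_sectors=8, persistent=True):
    """Return (origin_table, snapshot_table) as dm table strings.

    origin_dev:     device holding the filesystem (placeholder path).
    cow_dev:        dm device mapped over the big file (placeholder path).
    origin_sectors: size of the origin device in 512-byte sectors.
    """
    flag = "P" if persistent else "N"  # P keeps exception data across reboots
    origin_table = f"0 {origin_sectors} snapshot-origin {origin_dev}"
    snapshot_table = (
        f"0 {origin_sectors} snapshot {origin_dev} {cow_dev} "
        f"{flag} {chunk_sectors}"
    )
    return origin_table, snapshot_table


origin_table, snapshot_table = snapshot_tables(
    "/dev/sdX", "/dev/mapper/bigfile", 2097152)
# origin_table   is "0 2097152 snapshot-origin /dev/sdX"
# snapshot_table is "0 2097152 snapshot /dev/sdX /dev/mapper/bigfile P 8"
```

The filesystem would then be mounted from the device created with the origin table, so that writes go through it and old data lands in the store.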

If you decide that rollback is no longer needed, you just unload the
snapshot target and delete the big file. If you decide that you want to
roll back, you can use the snapshot merge functionality (or you can write a
userspace utility that does offline merge).

Mikulas