Re: ext4 regression in v5.9-rc2 from e7bfb5c9bb3d on ro fs with overlapped bitmaps

From: Theodore Y. Ts'o
Date: Mon Oct 05 2020 - 20:05:16 EST


On Mon, Oct 05, 2020 at 10:36:39AM -0700, Darrick J. Wong wrote:
> > Commit e7bfb5c9bb3d ("ext4: handle add_system_zone() failure in
> > ext4_setup_system_zone()") breaks mounting of read-only ext4 filesystems
> > with intentionally overlapping bitmap blocks.
> >
> > On an always-read-only filesystem explicitly marked with
> > EXT4_FEATURE_RO_COMPAT_SHARED_BLOCKS, prior to that commit, it's safe to
> > point all the block and inode bitmaps to a single block
>
> LOL, WHAT?
>
> I didn't know shared blocks applied to fs metadata. I thought that
> "shared" only applied to file extent maps being able to share physical
> blocks.

My understanding matches Darrick's. I was going to track down the
Google engineer who has most recently (as far as I know) enhanced
e2fsprogs's support of the shared block feature (see the commits
returned by "git log --author dvander@xxxxxxxxxx contrib/android") but
he's apparently out of the office today. Hopefully I'll be able to
track him down and ask about this tomorrow.

> Oookay. So that's not how you create these shared block ext4s,
> apparently...

Yeah, they are created by the e2fsdroid program. See sources in
contrib/e2fsdroid. I took a quick look, and I don't see anything
there which is frobbing with the bitmaps; but perhaps I'm missing
something, which is why I'd ask David to see if he knows anything
about this.

More to the point, if we have someone who is trying to dedup or
otherwise frob with bitmaps, I suspect this will break "e2fsck -E
unshare_blocks /dev/XXX", which is a way that you can take a root file
system which is using shared_blocks, and turn it into something that
can actually be mounted read/write.  This is something that I believe
was being used by AOSP "debug" or "userdebug" (I'm a bit fuzzy on the
details) so that Android developers could actually modify the root
file system. (Of course, you have to also disable dm-verity in order
for this to work....)
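
For anyone who wants to check whether an image is marked shared_blocks
before pointing e2fsck at it, here is a minimal sketch.  It assumes the
caller has already read the 1024-byte primary superblock (which starts
at byte offset 1024 of the device) into a buffer.  The function name is
made up for illustration, and the offsets and flag value are what I
believe the on-disk format uses; please check them against ext2_fs.h
before relying on this.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed on-disk layout; verify against lib/ext2fs/ext2_fs.h. */
#define SB_MAGIC_OFFSET          0x38    /* s_magic, little-endian u16 */
#define SB_RO_COMPAT_OFFSET      0x64    /* s_feature_ro_compat, le32 */
#define SB_MAGIC_VALUE           0xEF53
#define RO_COMPAT_SHARED_BLOCKS  0x4000

static uint16_t get_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: returns 1 if the superblock buffer has the
 * shared_blocks ro_compat bit set, 0 if it does not, and -1 if the
 * buffer is too short or does not carry the ext2/3/4 magic number.
 */
static int sb_has_shared_blocks(const unsigned char *sb, size_t len)
{
    if (len < SB_RO_COMPAT_OFFSET + 4)
        return -1;
    if (get_le16(sb + SB_MAGIC_OFFSET) != SB_MAGIC_VALUE)
        return -1;
    return (get_le32(sb + SB_RO_COMPAT_OFFSET) &
            RO_COMPAT_SHARED_BLOCKS) != 0;
}
```

This only inspects the feature bit; it says nothing about whether the
bitmaps themselves overlap.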

Unfortunately, e2fsdroid is currently not buildable under the standard
Linux compilation environment. For the reason why, see:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=928551#75

The first step would be to teach e2fsprogs's configure to check for
libsparse, and to link against it if it's available. But before we
could enable this by default for Linux distributions, we would need to link
against libsparse using dlopen(), since most distro release engineers
would be.... cranky.... if mke2fs were to drag in some random Android
libraries that have to be installed as shared libraries in their
installers. Which is the point of comment #75 in the above bug.
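
To make the dlopen() idea concrete, a rough sketch of the pattern is
below.  This is not code from e2fsprogs; the struct and function names
are hypothetical, and the library and symbol names passed in would be
whatever libsparse actually exports.  The point is only that the
library becomes an optional runtime dependency: if it is not
installed, the open fails cleanly and the caller can disable the
feature instead of failing to start.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical wrapper around an optional shared library. */
struct optional_lib {
    void *handle;   /* dlopen() handle, NULL when the library is absent */
    void *entry;    /* resolved symbol, NULL when it could not be found */
};

/*
 * Try to load "soname" and resolve "symbol" from it.
 * Returns 0 on success, -1 if either the library or the symbol is
 * unavailable; on failure both fields are left NULL.
 */
static int optional_lib_open(struct optional_lib *lib,
                             const char *soname, const char *symbol)
{
    lib->entry = NULL;
    lib->handle = dlopen(soname, RTLD_NOW | RTLD_LOCAL);
    if (!lib->handle)
        return -1;

    lib->entry = dlsym(lib->handle, symbol);
    if (!lib->entry) {
        dlclose(lib->handle);
        lib->handle = NULL;
        return -1;
    }
    return 0;
}

/* Release the library; safe to call after a failed open. */
static void optional_lib_close(struct optional_lib *lib)
{
    if (lib->handle)
        dlclose(lib->handle);
    lib->handle = NULL;
    lib->entry = NULL;
}
```

On older glibc versions this needs -ldl at link time.  The caller
would cast lib->entry to the appropriate function pointer type before
using it.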

Since shared_blocks is only used by Android, which is one of the very
few projects that wants a completely read-only root file system and
where dedup is actually significantly helpful, we've never tried to make
this work outside of the Android context. At least in theory, though,
it might be useful if we could create shared_blocks file systems using
"mke2fs -O shared_blocks -d /path/to/embedded-root-fs system.img 1G".
Patches gratefully accepted....

Cheers,

- Ted