Re: [RFC] jbd2 metadata checksumming
From: Andreas Dilger
Date: Fri Sep 23 2011 - 20:08:01 EST
On 2011-09-23, at 4:51 PM, Darrick J. Wong wrote:
> While I'm working on adding metadata checksumming to ext4, I figured that I
> ought to look into the similar feature in jbd2. At first I thought I'd simply
> change the default crc algorithm to crc32c and update the field in the commit
> block, but then it was suggested to me that I move that field into the journal
> superblock so that during recovery we don't have to scan ahead through the
> transaction to find the commit block so that we can learn the algorithm type.
>
> Doing that seems to require a format change to the superblock to add that
> field. I think that adding the crc-type field to the superblock is a rocompat
> change since we're not changing existing fields, just adding fields. It looks
> like the kernel and e2fsprogs code both reject a journal if they find unknown
> rocompat bits set. (Using a journal in ro mode is not useful.)
The question is whether a "rejected journal" is ignored during recovery
and not replayed at all, or whether it prevents mounting. If it is ignored
and not replayed, but the mount continues, that would lead to filesystem
corruption, which is very bad.

If it prevents mounting, and needs an updated kernel and/or e2fsprogs to
clear the flag (presumably the kernel will not enable this itself unless
told to do so by EXT4_FEATURE_INCOMPAT_METADATA_CSUM), that is not so bad,
and will still allow downgrading to an older kernel as long as the journal
is replayed first.
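To make the two outcomes concrete, here is a minimal sketch of the safe
behaviour: a loader that sees an unknown feature bit fails the mount instead
of quietly skipping replay. Every name and bit value here (demo_*, DEMO_*)
is hypothetical and is not taken from jbd2 or e2fsprogs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for illustration. */
#define DEMO_ROCOMPAT_CSUM_V2 0x00000001u /* stand-in for the proposed flag */
#define DEMO_KNOWN_INCOMPAT   0x00000001u /* bits an "old" loader understands */
#define DEMO_KNOWN_ROCOMPAT   0x00000000u /* old loader knows no rocompat bits */

struct demo_journal_sb {
    uint32_t feature_compat;
    uint32_t feature_incompat;
    uint32_t feature_ro_compat;
};

/* Returns 0 if every feature bit is understood, -1 otherwise. */
static int demo_journal_load(const struct demo_journal_sb *sb)
{
    assert(sb != NULL);
    if (sb->feature_incompat & ~DEMO_KNOWN_INCOMPAT)
        return -1;
    if (sb->feature_ro_compat & ~DEMO_KNOWN_ROCOMPAT)
        return -1;
    return 0;
}

/*
 * The safe behaviour: a refused journal fails the whole mount.
 * The unsafe alternative would be to return 0 here and carry on
 * without replaying the journal.
 */
static int demo_mount(const struct demo_journal_sb *sb)
{
    if (demo_journal_load(sb) != 0)
        return -1;
    /* ... replay the journal, then continue mounting ... */
    return 0;
}
```

With this shape, a journal carrying the new flag is simply unmountable on an
old loader until the flag is cleared by newer tools.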
> I decided to dig deeper to see what exactly the journal checksum covers. It
> appears to me that the superblock, revocation blocks, and commit blocks are not
> covered by a checksum. Revocation blocks ought to be checksummed because a
> lost write involving the second sector of a suitably large revocation block
> could result in the wrong blocks being skipped during recovery. It seems like
> it would be easy to extend the current journal_checksum feature to cover the
> commit block, and adding a checksum to the superblock seems trivial.
>
> Lastly, if I'm already making changes, I might as well bake the journal UUID
> into the checksum as well. The transaction ID is already in each metadata
> block by virtue of the common block header.
>
> So to summarize, I propose:
>
> 1. Adding a JBD2_FEATURE_ROCOMPAT_CHECKSUM_V2 feature flag, which provides:
> 2. A u8 field at offset 0x50 in the superblock which identifies the checksum
> algorithm that's in use;
> 3. A u32 field at offset 0x54 in the superblock to hold the superblock's
> checksum;
Why not put it at the end of the superblock, so that the checksum can cover
the whole thing?
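As a rough sketch of what I mean, with the checksum as the final field the
computation is just "everything before the checksum". The struct, the
1024-byte size and the demo_* names are assumptions for illustration, not
the real journal superblock layout, and the byte order of the stored value
is ignored here (a real on-disk format would fix it).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t demo_crc32c(uint32_t crc, const void *buf, size_t len)
{
    const uint8_t *p = buf;

    crc = ~crc;
    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return ~crc;
}

#define DEMO_SB_SIZE 1024 /* assumed size, for illustration only */

/* Hypothetical superblock: opaque body, checksum in the last 4 bytes. */
struct demo_sb {
    uint8_t  body[DEMO_SB_SIZE - sizeof(uint32_t)];
    uint32_t checksum;
};

static_assert(sizeof(struct demo_sb) == DEMO_SB_SIZE,
              "checksum must be the final 4 bytes");

/* The checksum covers every byte that precedes the checksum field. */
static uint32_t demo_sb_csum(const struct demo_sb *sb)
{
    return demo_crc32c(0, sb, offsetof(struct demo_sb, checksum));
}

static void demo_sb_set_csum(struct demo_sb *sb)
{
    sb->checksum = demo_sb_csum(sb);
}

static int demo_sb_verify(const struct demo_sb *sb)
{
    return sb->checksum == demo_sb_csum(sb);
}
```

A checksum placed in the middle of the structure needs the field zeroed or
skipped during the computation; placing it last avoids that special case.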
> 4. Changing the revocation block code to put a checksum in the 4 bytes
> following the revocation data, and to ensure those 4 bytes always exist;
It would be easier to see the changes if you included the structs.
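For example, I would guess item 4 looks something like the sketch below: a
4-byte tail at the very end of the block (one way to make sure those bytes
always exist), with the journal UUID fed into the CRC first. This is my
assumption, not the proposed format. The demo_* names, the tail position and
the big-endian storage of the checksum are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t demo_crc32c(uint32_t crc, const void *buf, size_t len)
{
    const uint8_t *p = buf;

    crc = ~crc;
    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return ~crc;
}

#define DEMO_UUID_LEN 16
#define DEMO_TAIL_LEN 4 /* hypothetical tail: one 32-bit checksum */

static void demo_put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

static uint32_t demo_get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* CRC over the UUID, then over the block minus its 4-byte tail. */
static uint32_t demo_revoke_csum(const uint8_t uuid[DEMO_UUID_LEN],
                                 const uint8_t *block, size_t block_size)
{
    assert(block_size > DEMO_TAIL_LEN);
    uint32_t crc = demo_crc32c(0, uuid, DEMO_UUID_LEN);
    return demo_crc32c(crc, block, block_size - DEMO_TAIL_LEN);
}

static void demo_revoke_set_csum(const uint8_t uuid[DEMO_UUID_LEN],
                                 uint8_t *block, size_t block_size)
{
    demo_put_be32(block + block_size - DEMO_TAIL_LEN,
                  demo_revoke_csum(uuid, block, block_size));
}

static int demo_revoke_verify(const uint8_t uuid[DEMO_UUID_LEN],
                              const uint8_t *block, size_t block_size)
{
    return demo_get_be32(block + block_size - DEMO_TAIL_LEN) ==
           demo_revoke_csum(uuid, block, block_size);
}
```

Because the checksum spans the whole block, a lost write to the second
sector shows up as a verification failure, and a block left over from a
journal with a different UUID fails as well.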
> 5. Adding the journal UUID to each checksum computation;
> 6. Extending the commit checksum to cover the commit block itself, with the
> commit block checksum field zeroed during the computation, of course;
> 7. Changing the default algorithm to crc32c; and
> 8. Updating ext4 to enable both checksum fields at journal load time, if the
> user supplies the journal_checksum mount option.
Probably this should also be conditional on the filesystem having the
EXT4_FEATURE_INCOMPAT_METADATA_CSUM feature enabled, so that we know the
kernel will be able to recover, and the user has explicitly requested this.

There is a mechanism for the ext4 code to pass features to the jbd2 code
already, so this shouldn't be a problem.
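Roughly, the gating I have in mind is sketched below. It is a standalone
mock-up, not kernel code: the structs, the bit values and
demo_journal_set_features() are hypothetical stand-ins for the real ext4 and
jbd2 interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for illustration. */
#define DEMO_MOUNT_JOURNAL_CHECKSUM    0x1u /* journal_checksum mount option */
#define DEMO_FS_INCOMPAT_METADATA_CSUM 0x1u /* filesystem feature flag */
#define DEMO_JOURNAL_ROCOMPAT_CSUM_V2  0x1u /* proposed journal flag */

struct demo_journal {
    uint32_t feature_ro_compat;
};

struct demo_fs {
    uint32_t mount_opts;
    uint32_t feature_incompat;
    struct demo_journal journal;
};

/* Stand-in for the filesystem-to-journal feature hand-off. */
static void demo_journal_set_features(struct demo_journal *j,
                                      uint32_t ro_compat)
{
    j->feature_ro_compat |= ro_compat;
}

/*
 * Enable the new journal checksums only when the user asked for them
 * AND the filesystem already carries the metadata checksum feature.
 * Returns true if the journal flag was set.
 */
static bool demo_enable_journal_csum(struct demo_fs *fs)
{
    assert(fs != NULL);

    bool requested = (fs->mount_opts & DEMO_MOUNT_JOURNAL_CHECKSUM) != 0;
    bool fs_has_csum =
        (fs->feature_incompat & DEMO_FS_INCOMPAT_METADATA_CSUM) != 0;

    if (!requested || !fs_has_csum)
        return false;

    demo_journal_set_features(&fs->journal, DEMO_JOURNAL_ROCOMPAT_CSUM_V2);
    return true;
}
```

The mount option alone never flips the journal flag, so a filesystem without
the feature stays usable by older kernels.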
Cheers, Andreas