Re: [PATCH v3] ext4: Fix rec_len verify error

From: Darrick J. Wong
Date: Tue Aug 01 2023 - 11:18:39 EST


On Tue, Aug 01, 2023 at 07:23:37PM +0800, zhangshida wrote:
> From: Shida Zhang <zhangshida@xxxxxxxxxx>
>
> With the configuration PAGE_SIZE 64k and filesystem blocksize 64k,
> a problem occurred when more than 13 million files were directly created
> under a directory:
>
> EXT4-fs error (device xx): ext4_dx_csum_set:492: inode #xxxx: comm xxxxx: dir seems corrupt? Run e2fsck -D.
> EXT4-fs error (device xx): ext4_dx_csum_verify:463: inode #xxxx: comm xxxxx: dir seems corrupt? Run e2fsck -D.
> EXT4-fs error (device xx): dx_probe:856: inode #xxxx: block 8188: comm xxxxx: Directory index failed checksum
>
> When enough files are created, fake_dirent->rec_len will be 0xffff,
> which does not equal the blocksize 65536, i.e. 0x10000.
>
> This differs from the 4k blocksize case: when enough files are
> created, fake_dirent->rec_len will be 0x1000, which equals the
> blocksize 4k, i.e. 0x1000.
>
> The problem seems to be related to the limitation of the 16-bit field
> when the blocksize is set to 64k.
> To address this, helpers like ext4_rec_len_{from,to}_disk have already
> been introduced to convert between the encoded and the plain form of
> rec_len.
>
> So fix this one by using the helper, and all the other
> le16_to_cpu(ext4_dir_entry{,_2}.rec_len) accesses in this file too.
>
> Cc: stable@xxxxxxxxxx
> Fixes: dbe89444042a ("ext4: Calculate and verify checksums for htree nodes")
> Suggested-by: Andreas Dilger <adilger@xxxxxxxxx>
> Suggested-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> Signed-off-by: Shida Zhang <zhangshida@xxxxxxxxxx>
> ---
> v1->v2:
> Use the existing helper to convert the rec_len, as suggested by Andreas.
> v2->v3:
> 1,Convert all the other rec_len accesses where necessary, as suggested by Darrick.
> 2,Rephrase the commit message.
>
> fs/ext4/namei.c | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 0caf6c730ce3..8cb377b8ad86 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -346,14 +346,14 @@ static struct ext4_dir_entry_tail *get_dirent_tail(struct inode *inode,
>
> #ifdef PARANOID
> struct ext4_dir_entry *d, *top;
> + int blocksize = EXT4_BLOCK_SIZE(inode->i_sb);
>
> d = (struct ext4_dir_entry *)bh->b_data;
> top = (struct ext4_dir_entry *)(bh->b_data +
> - (EXT4_BLOCK_SIZE(inode->i_sb) -
> - sizeof(struct ext4_dir_entry_tail)));
> - while (d < top && d->rec_len)
> + (blocksize - sizeof(struct ext4_dir_entry_tail)));
> + while (d < top && ext4_rec_len_from_disk(d->rec_len, blocksize))
> d = (struct ext4_dir_entry *)(((void *)d) +
> - le16_to_cpu(d->rec_len));
> + ext4_rec_len_from_disk(d->rec_len, blocksize));
>
> if (d != top)
> return NULL;

This is still missing some pieces; what about this clause at line 367:

if (t->det_reserved_zero1 ||
le16_to_cpu(t->det_rec_len) != sizeof(struct ext4_dir_entry_tail) ||
t->det_reserved_zero2 ||
t->det_reserved_ft != EXT4_FT_DIR_CSUM)
return NULL;

> @@ -445,13 +445,13 @@ static struct dx_countlimit *get_dx_countlimit(struct inode *inode,
> struct ext4_dir_entry *dp;
> struct dx_root_info *root;
> int count_offset;
> + int blocksize = EXT4_BLOCK_SIZE(inode->i_sb);
>
> - if (le16_to_cpu(dirent->rec_len) == EXT4_BLOCK_SIZE(inode->i_sb))
> + if (ext4_rec_len_from_disk(dirent->rec_len, blocksize) == blocksize)
> count_offset = 8;
> - else if (le16_to_cpu(dirent->rec_len) == 12) {
> + else if (ext4_rec_len_from_disk(dirent->rec_len, blocksize) == 12) {

Why not lift this ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ to a
local variable? @dirent doesn't change, right?
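
Roughly what I have in mind, as a userspace sketch and not a patch. Every
model_* name below is made up for illustration, and the decode logic is
written from memory of the PAGE_SIZE >= 65536 branch of
ext4_rec_len_from_disk(), so treat it as an assumption and check it
against fs/ext4/ext4.h. The point is only that rec_len gets decoded once
and both comparisons then use the local:

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MAX_REC_LEN 0xffffu	/* stand-in for EXT4_MAX_REC_LEN */

enum model_dx_kind { MODEL_DX_INVALID, MODEL_DX_NODE, MODEL_DX_ROOT };

/*
 * Hypothetical userspace model of ext4_rec_len_from_disk() for the
 * large-page case. The argument is assumed to be in CPU byte order.
 */
static unsigned int model_rec_len_from_disk(uint16_t dlen,
					    unsigned int blocksize)
{
	unsigned int len = dlen;

	if (len == MODEL_MAX_REC_LEN || len == 0)
		return blocksize;
	return (len & 65532) | ((len & 3) << 16);
}

/*
 * Mirrors the shape of the rec_len checks in get_dx_countlimit():
 * decode the first dirent's rec_len once, then compare the local.
 */
static enum model_dx_kind model_classify(uint16_t dirent_rec_len,
					 uint16_t dp_rec_len,
					 unsigned int blocksize)
{
	unsigned int rlen = model_rec_len_from_disk(dirent_rec_len, blocksize);

	if (rlen == blocksize)
		return MODEL_DX_NODE;	/* fake dirent spans the whole block */
	if (rlen == 12) {
		/* "." entry; the following entry must cover the rest */
		if (model_rec_len_from_disk(dp_rec_len, blocksize) !=
		    blocksize - 12)
			return MODEL_DX_INVALID;
		return MODEL_DX_ROOT;
	}
	return MODEL_DX_INVALID;
}
```

With a 64k block, model_classify(0xffff, 0, 65536) lands in the first
branch, which is the case the open-coded comparison gets wrong.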

> dp = (struct ext4_dir_entry *)(((void *)dirent) + 12);
> - if (le16_to_cpu(dp->rec_len) !=
> - EXT4_BLOCK_SIZE(inode->i_sb) - 12)
> + if (ext4_rec_len_from_disk(dp->rec_len, blocksize) != blocksize - 12)
> return NULL;
> root = (struct dx_root_info *)(((void *)dp + 12));
> if (root->reserved_zero ||

What about dx_make_map?

Here are all the open-coded accesses I could find:

$ git grep le16.*rec_len fs/ext4/
fs/ext4/namei.c:356: le16_to_cpu(d->rec_len));
fs/ext4/namei.c:367: le16_to_cpu(t->det_rec_len) != sizeof(struct ext4_dir_entry_tail) ||
fs/ext4/namei.c:449: if (le16_to_cpu(dirent->rec_len) == EXT4_BLOCK_SIZE(inode->i_sb))
fs/ext4/namei.c:451: else if (le16_to_cpu(dirent->rec_len) == 12) {
fs/ext4/namei.c:453: if (le16_to_cpu(dp->rec_len) !=
fs/ext4/namei.c:1338: map_tail->size = le16_to_cpu(de->rec_len);
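
For anyone following along, here is a small userspace sketch of why a raw
le16_to_cpu() read and the helper can disagree at 64k. The model_* names
are hypothetical, and both conversions are reproduced from memory of the
PAGE_SIZE >= 65536 branches of ext4_rec_len_{to,from}_disk(), so this is
an assumption to verify against the tree, not the kernel code itself:

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MAX_REC_LEN 0xffffu	/* stand-in for EXT4_MAX_REC_LEN */

/* Hypothetical model: plain length -> 16-bit on-disk encoding. */
static uint16_t model_rec_len_to_disk(unsigned int len,
				      unsigned int blocksize)
{
	if (len < 65536)
		return (uint16_t)len;
	if (len == blocksize)
		return (uint16_t)(blocksize == 65536 ? MODEL_MAX_REC_LEN : 0);
	return (uint16_t)((len & 65532) | ((len >> 16) & 3));
}

/* Hypothetical model: 16-bit on-disk encoding -> plain length. */
static unsigned int model_rec_len_from_disk(uint16_t dlen,
					    unsigned int blocksize)
{
	unsigned int len = dlen;

	if (len == MODEL_MAX_REC_LEN || len == 0)
		return blocksize;
	return (len & 65532) | ((len & 3) << 16);
}

/* What an open-coded le16_to_cpu(de->rec_len) hands back: the raw field. */
static unsigned int model_open_coded(uint16_t dlen)
{
	return dlen;
}
```

In this model a full-block entry at 64k is stored as 0xffff; the helper
decodes that back to 65536 while the raw read stays at 0xffff. At 4k the
two agree, which matches the 0x1000 case in the commit message.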

--D

> --
> 2.27.0
>