Re: [PATCH v3] ext4: Fix rec_len verify error
From: Andreas Dilger
Date: Thu Aug 03 2023 - 18:34:10 EST
On Aug 2, 2023, at 9:09 PM, Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
>
> On Thu, Aug 03, 2023 at 09:52:53AM +0800, Stephen Zhang wrote:
>> Andreas Dilger <adilger@xxxxxxxxx> 于2023年8月2日周三 14:07写道:
>>>
>>> Not all of these cases are actual bugs. The ext4_rec_len_from_disk()
>>> function is only different for rec_len >= 2^16, so if it is comparing
>>> rec_len against "12" or "sizeof(struct ...)" then the inequality will
>>> be correct regardless of how it is decoded.
>>>
>>> That said, it makes sense to use ext4_rec_len_from_disk() to access
>>> rec_len consistently throughout the code, since that avoids potential
>>> bugs in the future. We know the code will eventually will be copied
>>> some place where rec_len >= 2^16 is actually important, and we may as
>>> well avoid that bug before it happens.
>>>
>>>
>>> One thing this discussion *does* expose is that ext4_rec_len_from_disk()
>>> is hard-coded at compile time to differentiate between PAGE_SIZE > 64k
>>> and PAGE_SIZE = 4K, because it was never possible to have blocksize >
>>> PAGE_SIZE, so only ARM/PPC ever had filesystems with blocksize=64KiB
>>> (and the Fujitsu Fugaku SPARC system with blocksize=256KiB).
>>>
>>> However, with the recent advent of the VM and IO layers allowing
>>> blocksize > PAGE_SIZE this function will need to be changed to allow
>>> the same on x86 PAGE_SIZE=4KiB systems. Instead of checking
>>>
>>> #if PAGE_SIZE >= 65536
>>>
>>> it should handle this based on the filesystem blocksize at runtime:
>>>
>>> static inline
>>> unsigned int ext4_rec_len_from_disk(__le16 dlen, unsigned blocksize)
>>> {
>>> unsigned len = le16_to_cpu(dlen);
>>>
>>> if (blocksize < 65536)
>>> return len;
>>>
>>> if (len == EXT4_MAX_REC_LEN || len == 0)
>>> return blocksize;
>>>
>>> return (len & 65532) | ((len & 3) << 16);
>>> }
>>>
>>> Strictly speaking, ((len & 65532) | ((len & 3) << 16) should equal "len"
>>> for any filesystem with blocksize < 65536, but IMHO it is more clear if
>>> the code is written this way.
>>>
>>> Similarly, the encoding needs to be changed to handle large records at
>>> runtime for when we eventually allow ext4 with blocksize > PAGE_SIZE.
>>>
>>> static inline __le16 ext4_rec_len_to_disk(unsigned len, unsigned blocksize)
>>> {
>>> BUG_ON(len > blocksize);
>>> BUG_ON(blocksize > (1 << 18));
>>> BUG_ON(len & 3);
>>>
>>> if (len < 65536) /* always true for blocksize < 65536 */
>>> return cpu_to_le16(len);
>>>
>>> if (len == blocksize) {
>>> if (blocksize == 65536)
>>> return cpu_to_le16(EXT4_MAX_REC_LEN);
>>>
>>> return cpu_to_le16(0);
>>> }
>>>
>>> return cpu_to_le16((len & 65532) | ((len >> 16) & 3));
>>> }
>>>
>>
>> Hmm, at least it sounds reasonable to me based on my limited
>> knowledge. However, I am not sure whether you want me to incorporate
>> these changes into this particular commit or another patch within this
>> submission.
>>
>> By default, I will simply leave it for further discussion. Please let
>> me know if you have any ideas.
>
> ext4 doesn't support blocksize > PAGE_SIZE yet. Don't worry about this
> for now.
I agree it doesn't need to be merged into the current patch.
It's something that could be fixed in a follow-on patch, to have one less
bug to fix in the future when ext4 *does* support blocksize > PAGE_SIZE,
which isn't so far away anymore.
Cheers, Andreas
Attachment:
signature.asc
Description: Message signed with OpenPGP