Re: [linus:master] [kbuild] 3bc753c06d: xfstests.generic.454.fail
From: Jason A. Donenfeld
Date: Thu Dec 29 2022 - 21:13:44 EST
On Thu, Dec 29, 2022 at 10:55:05AM -0800, Linus Torvalds wrote:
> Also, I'm surprised this hasn't been an issue earlier - 'char' has
> always been unsigned on arm (among other architectures), so if this
> test started failing now on x86-64 due to -funsigned-char, it has
> presumably been failing on arm the whole time.
That's the curious part, indeed...
> Oh, I think I see one potential problem in ext4:
>
> ext4_xattr_hash_entry() is hot garbage. Lookie here:
>
> while (name_len--) {
> hash = (hash << NAME_HASH_SHIFT) ^
> (hash >> (8*sizeof(hash) - NAME_HASH_SHIFT)) ^
> *name++;
> }
>
> so that hash will now depend on the sign of that 'char *name' pointer.
>
> If that hash ever has any long-term meaning (ie saved on disk or
> exposed some other way), that would be problematic.
Note that ext4 has lots of sign-specific code for hashing. Only some of
it can now be removed, since compatibility with old file systems must be
preserved. But what I mean is the code that begins in super.c:
i = le32_to_cpu(es->s_flags);
if (i & EXT2_FLAGS_UNSIGNED_HASH)
sbi->s_hash_unsigned = 3;
else if ((i & EXT2_FLAGS_SIGNED_HASH) == 0) {
#ifdef __CHAR_UNSIGNED__
if (!sb_rdonly(sb))
es->s_flags |=
cpu_to_le32(EXT2_FLAGS_UNSIGNED_HASH);
sbi->s_hash_unsigned = 3;
#else
if (!sb_rdonly(sb))
es->s_flags |=
cpu_to_le32(EXT2_FLAGS_SIGNED_HASH);
#endif
}
The second part of that #else can now go away. And then maybe the whole
expression can be simplified.
These actually wind up being used in namei.c:
hinfo->hash_version += EXT4_SB(dir->i_sb)->s_hash_unsigned;
, which then sets the hash version that's selected in hash.c:
switch (hinfo->hash_version) {
case DX_HASH_LEGACY_UNSIGNED:
hash = dx_hack_hash_unsigned(name, len);
break;
case DX_HASH_LEGACY:
hash = dx_hack_hash_signed(name, len);
break;
case DX_HASH_HALF_MD4_UNSIGNED:
str2hashbuf = str2hashbuf_unsigned;
fallthrough;
case DX_HASH_HALF_MD4:
p = name;
[...]
And so on. dx_hack_hash_unsigned() and dx_hack_hash_signed() are the
same functions, except one uses `unsigned char` and the other uses
`signed char`. It's unfortunate these exist, but now it's part of the
on-disk format, so they have to stick around (along with other warts
like "halfmd4").
But at least for new file systems, things should be unified. Anyway, it
looks like for *these* hashes, the ext4 developers did consider the
signedness issue.
Sounds like maybe it was left out of ext4_xattr_hash_entry(), which does
indeed look like it's part of the on-disk representation:
static int
ext4_xattr_inode_verify_hashes(struct inode *ea_inode,
struct ext4_xattr_entry *entry, void *buffer,
size_t size)
{
u32 hash;
/* Verify stored hash matches calculated hash. */
hash = ext4_xattr_inode_hash(EXT4_SB(ea_inode->i_sb), buffer, size);
if (hash != ext4_xattr_inode_get_hash(ea_inode))
return -EFSCORRUPTED;
if (entry) {
__le32 e_hash, tmp_data;
/* Verify entry hash. */
tmp_data = cpu_to_le32(hash);
e_hash = ext4_xattr_hash_entry(entry->e_name, entry->e_name_len,
&tmp_data, 1);
if (e_hash != entry->e_hash)
return -EFSCORRUPTED;
}
return 0;
}
So if ext4_xattr_hash_entry() is indeed broken with respect to
heisensignness, then this stuff has always been broken, and it's
probably good that this is being unearthed...
Jason