RE: [PATCH v3] hfsplus: validate btree bitmap during mount and handle corruption gracefully
From: Viacheslav Dubeyko
Date: Mon Feb 16 2026 - 19:42:03 EST
On Sun, 2026-02-15 at 23:27 +0530, Shardul Bankar wrote:
> On Tue, 2026-02-03 at 23:12 +0000, Viacheslav Dubeyko wrote:
> > On Tue, 2026-02-03 at 14:28 +0530, Shardul Bankar wrote:
> > > On Mon, 2026-02-02 at 20:52 +0000, Viacheslav Dubeyko wrote:
> > > > On Sat, 2026-01-31 at 19:34 +0530, Shardul Bankar wrote:
> > > >
> > > > >
> > > > > + len = hfs_brec_lenoff(node,
> > > > > + HFSPLUS_BTREE_HDR_MAP_REC,
> > > > > &bitmap_off);
> > > > > +
> > > > > + if (len != 0 && bitmap_off >= sizeof(struct
> > > > > hfs_bnode_desc)) {
> > > >
> > > > If we read incorrect len and/or bitmap_off, then it sounds like
> > > > corruption too.
> > > > We need to process this issue somehow but you ignore this,
> > > > currently.
> > > > ;)
> >
> > I am not suggesting to check everything. But because you are using
> > these values
> > and you can detect that these value is incorrect, then it makes sense
> > to report
> > the error if something is wrong.
>
> Hi Slava,
>
> Ah, I misunderstood you previously!
>
> You are right. I will add an explicit error log and force SB_RDONLY for
> this invalid offset/length case in v4 as well.
>
> >
> > >
> > > > > + hfs_bnode_read(node, &bitmap_byte, bitmap_off,
> > > > > 1);
> > > >
> > > > I assume that 1 is the size of byte, then sizeof(u8) or
> > > > sizeof(bitmap_byte)
> > > > could look much better than hardcoded value.
> > >
> > > Ack'ed. Changing to sizeof(bitmap_byte).
> > >
> > > >
> > > > > + if (!(bitmap_byte & HFSPLUS_BTREE_NODE0_BIT)) {
> > > >
> > > > Why don't use the test_bit() [1] here? I believe that code will
> > > > be
> > > > more simple
> > > > in such case.
> > >
> > >
> >
> > Correct me if I am wrong, but I suspect that I am right. :) The
> > endianness is
> > about bytes not bits. I am googled this: "Endianness defines the
> > order in which
> > bytes, constituting multibyte data (like 16, 32, or 64-bit integers),
> > are stored
> > in memory or transmitted over a network.". So, if you need manage
> > endianness of
> > of values, then you can use cpu_to_bexx()/bexx_to_cpu() family of
> > methods. But
> > it's not about bits. If you take byte only, then the representation
> > of byte is
> > the same in Big-Endian (BE) or Little-Endian (LE). Am I right here?
> > :)
>
> You are right that a single byte's representation is the same in memory
> regardless of CPU endianness. To be completely transparent, the precise
> interplay between cross-architecture kernel bitops, memory alignment,
> and on-disk format endianness has always been a bit of a conceptual
> blind spot for me.
>
> My initial hesitation with `test_bit()` was also that it strictly
> requires an `unsigned long *`. Reading a 1-byte on-disk requirement
> directly into an `unsigned long` buffer felt sub-optimal and could risk
> out-of-bounds stack reads if not handled carefully.
>
> However, I want to follow your recommendation to standardize on kernel
> bitops. To safely use `test_bit()`, I propose reading the byte safely
> into a `u8`, and then promoting it to an `unsigned long` before
> testing; something like this on top of the patch:
>
> diff --git a/fs/hfsplus/btree.c b/fs/hfsplus/btree.c
> index 4bd0d18ac6c6..7ce1708e423c 100644
> --- a/fs/hfsplus/btree.c
> +++ b/fs/hfsplus/btree.c
> @@ -135,6 +135,7 @@ struct hfs_btree *hfs_btree_open(struct super_block
> *sb, u32 id)
> struct hfs_btree *tree;
> struct hfs_btree_header_rec *head;
> struct address_space *mapping;
> + unsigned long bitmap_word;
> struct hfs_bnode *node;
> u16 len, bitmap_off;
> struct inode *inode;
> @@ -255,7 +256,8 @@ struct hfs_btree *hfs_btree_open(struct super_block
> *sb, u32 id)
>
> if (len != 0 && bitmap_off >= sizeof(struct hfs_bnode_desc)) {
> hfs_bnode_read(node, &bitmap_byte, bitmap_off,
> sizeof(bitmap_byte));
> - if (!(bitmap_byte & HFSPLUS_BTREE_NODE0_BIT)) {
> + bitmap_word = bitmap_byte;
> + if (!test_bit(HFSPLUS_BTREE_NODE0_BIT, &bitmap_word)) {
> const char *tree_name;
OK. I see. I don't think that we need to make the code more complicated. If we
can simply read the bitmap_word and then to use it for test_bit(), then it
sounds like reasonable one. Otherwise, I think we don't need to use some tricks
to use the test_bit(). Your previous code looks OK, then.
But, I started to think about something bigger. Please, see my comments below.
>
> switch (id) {
> diff --git a/include/linux/hfs_common.h b/include/linux/hfs_common.h
> index f2f70975727c..52e9b8969406 100644
> --- a/include/linux/hfs_common.h
> +++ b/include/linux/hfs_common.h
> @@ -512,7 +512,7 @@ struct hfs_btree_header_rec {
> #define HFSPLUS_BTREE_HDR_NODE_RECS_COUNT 3
> #define HFSPLUS_BTREE_HDR_MAP_REC_INDEX 2 /* Map (bitmap)
> record in header node */
> #define HFSPLUS_BTREE_HDR_USER_BYTES 128
> -#define HFSPLUS_BTREE_NODE0_BIT 0x80
> +#define HFSPLUS_BTREE_NODE0_BIT 7
>
> /* btree key type */
> #define HFSPLUS_KEY_CASEFOLDING 0xCF /* case-
> insensitive */
>
>
> > > > > #define HFSPLUS_BTREE_HDR_USER_BYTES 128
> > > > > +#define HFSPLUS_BTREE_NODE0_BIT 0x80
> > > >
> > > > Maybe, (1 << something) instead of 0x80? I am OK with constant
> > > > too.
> > >
> > > Ack'ed, will use (1 << 7). Can also use BIT(7) if you prefer.
> >
> > OK. So, are you sure that node #0 corresponds to bit #7 but not bit
> > #0? :) I am
> > started to doubt that we are correct here.
>
> According to Apple's Tech Note TN1150 [1], HFS+ maps Node 0 to the Most
> Significant Bit (MSB).
>
> Specifically, TN1150 states:
>
> 1. Under "Map Record": "The bits are interpreted in the same way as the
> bits in the allocation file."
> 2. Under "Allocation File": "Within each byte, the most significant bit
> holds information about the allocation block with the lowest number..."
>
> Apple also provides a C code snippet in the document (Listing 1)
> demonstrating this:
> `return (thisByte & (1 << (7 - (thisAllocationBlock % 8)))) != 0;`
>
> If we evaluate this for Node 0 (`thisAllocationBlock = 0`), it resolves
> to `1 << 7` (or bit 7). Therefore, checking bit 7 is mathematically
> required to target Node 0.
>
> If these changes look proper to you, I will add them to my code along
> with the rest of your feedback and send out v4 shortly.
>
> Thanks again for the guidance, continued review and for bearing with me
> on this!
>
As far as I can see, the main logic of working with b-tree map is here [1].
Ideally, we should have a generic method that can be used as here [1] as in your
patch. Could you try to make the search in b-tree map by generic method?
Thanks,
Slava.
[1] https://elixir.bootlin.com/linux/v6.19-rc5/source/fs/hfsplus/btree.c#L412