Re: [PATCH v2] hfs: prevent MDB and bitmap buffer_head aliasing

From: Sam Sun

Date: Fri Jun 05 2026 - 00:01:25 EST

On Fri, Jun 5, 2026 at 7:52 AM Viacheslav Dubeyko <slava@xxxxxxxxxxx> wrote:
>
> On Thu, 2026-06-04 at 14:25 -0700, Viacheslav Dubeyko wrote:
> > On Fri, 2026-06-05 at 02:02 +0800, Yue Sun wrote:
> > > On Thu, Jun 4, 2026 at 12:25 AM <slava@xxxxxxxxxxx> wrote:
> > > >
> > > > Could you please take a deeper look into my diff that I've shared
> > > > with
> > > > you before? I don't see the point to review your suggestion if
> > > > you
> > > > completely ignored my diff. From my point of view you suggested
> > > > completely the same approach but in more complicated manner.
> > >
> > > You are right. I am sorry for the confusion.
> > >
> > > I should have looked more carefully at your diff and based my reply
> > > on
> > > it. Adding an HFS-level MDB mutex and avoiding holding mdb_bh
> > > across
> > > the
> > > whole hfs_mdb_commit() path is the right direction. My previous
> > > reply
> > > mixed that with extra cleanups and a separate corrupted-layout
> > > check,
> > > which made it look like I was proposing a different approach. That
> > > was
> > > my mistake.
> > >
> > > I am not very familiar with HFS internals, so please treat the
> > > following
> > > only as a small suggestion. After re-reading your diff, the only
> > > detail
> > > I am still unsure about is the HFS_FLG_ALT_MDB_DIRTY path. In that
> > > path,
> > > hfs_inode_write_fork() gets MDB fields as output buffers:
> > >
> > > hfs_inode_write_fork(..., mdb->drXTExtRec,
> > > &mdb->drXTFlSize, NULL);
> > > hfs_inode_write_fork(..., mdb->drCTExtRec,
> > > &mdb->drCTFlSize, NULL);
> >
> >
> > I've started to think that, maybe, the location of these
> > hfs_inode_write_fork() calls is not correct. We are trying to save
> > the
> > current size of Catalog File and Extents Overflow File. OK, we need
> > to
> > update this if size is changed. But we can save the sizes during
> > every
> > call of hfs_mdb_commit(). It's logically incorrect to call these
> > methods in the alternative MDB section because it is the content of
> > primary MDB. I suggest to move these two hfs_inode_write_fork() to
> > primary MDB writing section:
> >
> > lock_buffer(HFS_SB(sb)->mdb_bh);
> > /* These parameters may have been modified, so write
> > them back */
> > mdb->drLsMod = hfs_mtime();
> > mdb->drFreeBks = cpu_to_be16(HFS_SB(sb)-
> > > free_ablocks);
> > mdb->drNxtCNID =
> > cpu_to_be32((u32)atomic64_read(&HFS_SB(sb)-
> > > next_id));
> > mdb->drNmFls = cpu_to_be16(HFS_SB(sb)->root_files);
> > mdb->drNmRtDirs = cpu_to_be16(HFS_SB(sb)-
> > >root_dirs);
> > mdb->drFilCnt =
> > cpu_to_be32((u32)atomic64_read(&HFS_SB(sb)-
> > > file_count));
> > mdb->drDirCnt =
> > cpu_to_be32((u32)atomic64_read(&HFS_SB(sb)-
> > > folder_count));
> >
> > hfs_inode_write_fork(HFS_SB(sb)->ext_tree->inode,
> > mdb-
> > > drXTExtRec,
> > &mdb->drXTFlSize, NULL);
> > hfs_inode_write_fork(HFS_SB(sb)->cat_tree->inode,
> > mdb-
> > > drCTExtRec,
> > &mdb->drCTFlSize, NULL);
> >
> > /* write MDB to disk */
> > mark_buffer_dirty(HFS_SB(sb)->mdb_bh);
> > unlock_buffer(HFS_SB(sb)->mdb_bh);
> >
> > In this case, we don;'t need to overlap the locks. What do you think?
> > Sounds like reasonable solution?
> >
> >
>
> Frankly speaking, I don't like this approach of assigning the pointer
> on bh->b_data to HFS_SB(sb)->mdb:
>
> data = (void *)(__bh->b_data + __offset);
>
> We manipulate by various fields of HFS_SB(sb)->mdb throughout of the
> hfs_mdb_commit(). I think that we need to allocate buffer for struct
> hfs_mdb *mdb during hfs_mdb_get() (and destroy in hfs_mdb_put()) and
> synchronize its content with struct buffer_head *mdb_bh. Otherwise, we
> never can resolve the problem of proper locking in hfs_mdb_commit().
> What do you think?
>
> Thanks,
> Slava.
>

Yes, I agree. Keeping HFS_SB(sb)->mdb as a separate in-memory copy sounds
cleaner than pointing it directly into mdb_bh->b_data.

For the immediate deadlock fix, I think moving hfs_inode_write_fork() into the
primary MDB update section as you suggested sounds like a good minimal step. The
separate in-memory MDB copy sounds like a larger cleanup that could follow if
you think that is the right direction.

Thanks,
Yue

> >
> > > Since mdb points into mdb_bh->b_data, these calls seem to update
> > > fields
> > > in the primary MDB buffer. So I was thinking that we could keep
> > > your
> > > locking scheme, but move unlock_buffer(mdb_bh) slightly later,
> > > after
> > > these primary MDB updates and mark_buffer_dirty(mdb_bh). The
> > > alternate
> > > MDB copy/sync and the bitmap writeback would still happen after
> > > mdb_bh
> > > is unlocked, so the main deadlock fix remains the same.
> > >
> > > If this concern is not valid, or if the current HFS update rules
> > > already
> > > make your shorter window sufficient, please ignore this. I defer to
> > > your
> > > judgment here.
> > >
> > > Below is your diff with only that small adjustment.
> > >
> > > Thanks,
> > > Yue
> > >
> > > diff --git a/fs/hfs/hfs_fs.h b/fs/hfs/hfs_fs.h
> > > index ac0e83f77a0f..00651f42ba38 100644
> > > --- a/fs/hfs/hfs_fs.h
> > > +++ b/fs/hfs/hfs_fs.h
> > > @@ -66,6 +66,7 @@ struct hfs_inode_info {
> > > * The HFS-specific part of a Linux (struct super_block)
> > > */
> > > struct hfs_sb_info {
> > > + struct mutex mdb_lock; /* MDB operations
> > > lock */
> > > struct buffer_head *mdb_bh; /* The hfs_buffer
> > > holding the
> > > real
> > > superblock (aka
> > > VIB
> > > diff --git a/fs/hfs/mdb.c b/fs/hfs/mdb.c
> > > index a97cea35ca2e..d81e3545a045 100644
> > > --- a/fs/hfs/mdb.c
> > > +++ b/fs/hfs/mdb.c
> > > @@ -287,6 +287,7 @@ int hfs_mdb_get(struct super_block *sb)
> > > void hfs_mdb_commit(struct super_block *sb)
> > > {
> > > struct hfs_mdb *mdb = HFS_SB(sb)->mdb;
> > > + bool write_backup = false;
> > >
> > > if (sb_rdonly(sb))
> > > return;
> > > @@ -314,11 +315,17 @@ void hfs_mdb_commit(struct super_block *sb)
> > > * files grow. */
> > > if (test_and_clear_bit(HFS_FLG_ALT_MDB_DIRTY, &HFS_SB(sb)-
> > > > flags) &&
> > > HFS_SB(sb)->alt_mdb) {
> > > + write_backup = true;
> > > hfs_inode_write_fork(HFS_SB(sb)->ext_tree->inode,
> > > mdb->drXTExtRec,
> > > &mdb->drXTFlSize, NULL);
> > > hfs_inode_write_fork(HFS_SB(sb)->cat_tree->inode,
> > > mdb->drCTExtRec,
> > > &mdb->drCTFlSize, NULL);
> > >
> > > + mark_buffer_dirty(HFS_SB(sb)->mdb_bh);
> > > + }
> > > + unlock_buffer(HFS_SB(sb)->mdb_bh);
> > > +
> > > + if (write_backup) {
> > > lock_buffer(HFS_SB(sb)->alt_mdb_bh);
> > > memcpy(HFS_SB(sb)->alt_mdb, HFS_SB(sb)->mdb,
> > > HFS_SECTOR_SIZE);
> > > HFS_SB(sb)->alt_mdb->drAtrb |=
> > > cpu_to_be16(HFS_SB_ATTRIB_UNMNT);
> > > @@ -360,7 +367,6 @@ void hfs_mdb_commit(struct super_block *sb)
> > > size -= len;
> > > }
> > > }
> > > - unlock_buffer(HFS_SB(sb)->mdb_bh);
> > > }
> > >
> > > void hfs_mdb_close(struct super_block *sb)
> > > diff --git a/fs/hfs/super.c b/fs/hfs/super.c
> > > index a4f2a2bfa6d3..06ab904b3a42 100644
> > > --- a/fs/hfs/super.c
> > > +++ b/fs/hfs/super.c
> > > @@ -35,7 +35,11 @@ MODULE_LICENSE("GPL");
> > > static int hfs_sync_fs(struct super_block *sb, int wait)
> > > {
> > > is_hfs_cnid_counts_valid(sb);
> > > +
> > > + mutex_lock(&HFS_SB(sb)->mdb_lock);
> > > hfs_mdb_commit(sb);
> > > + mutex_unlock(&HFS_SB(sb)->mdb_lock);
> > > +
> > > return 0;
> > > }
> > >
> > > @@ -68,7 +72,9 @@ static void flush_mdb(struct work_struct *work)
> > >
> > > is_hfs_cnid_counts_valid(sb);
> > >
> > > + mutex_lock(&sbi->mdb_lock);
> > > hfs_mdb_commit(sb);
> > > + mutex_unlock(&sbi->mdb_lock);
> > > }
> > >
> > > void hfs_mark_mdb_dirty(struct super_block *sb)
> > > @@ -339,9 +345,12 @@ static int hfs_fill_super(struct super_block
> > > *sb, struct fs_context *fc)
> > > sb->s_op = &hfs_super_operations;
> > > sb->s_xattr = hfs_xattr_handlers;
> > > sb->s_flags |= SB_NODIRATIME;
> > > + mutex_init(&sbi->mdb_lock);
> > > mutex_init(&sbi->bitmap_lock);
> > >
> > > + mutex_lock(&sbi->mdb_lock);
> > > res = hfs_mdb_get(sb);
> > > + mutex_unlock(&sbi->mdb_lock);
> > > if (res) {
> > > if (!silent)
> > > pr_warn("can't find a HFS filesystem on
> > > dev
> > > %s\n",