RE: [PATCH v2 1/2] hfs: Validate CNIDs in hfs_read_inode

From: Viacheslav Dubeyko

Date: Mon Nov 10 2025 - 19:50:59 EST

On Tue, 2025-11-11 at 00:00 +0000, George Anthony Vernon wrote:
> On Tue, Nov 04, 2025 at 10:34:15PM +0000, Viacheslav Dubeyko wrote:
> > On Tue, 2025-11-04 at 01:47 +0000, George Anthony Vernon wrote:
> > > hfs_read_inode previously did not validate CNIDs read from disk, thereby
> > > allowing inodes to be constructed with disallowed CNIDs and placed on
> > > the dirty list, eventually hitting a bug on writeback.
> > >
> > > Validate reserved CNIDs according to Apple technical note TN1150.
> >
> > The TN1150 technical note describes HFS+ file system and it needs to take into
> > account the difference between HFS and HFS+. So, it is not completely correct
> > for the case of HFS to follow to the TN1150 technical note as it is.
>
> I've checked Inside Macintosh: Files Chapter 2 page 70 to make sure HFS
> is the same (CNIDs 1 - 5 are assigned, and all of 1-15 are reserved).
> I will add this to the commit message for V3.
>
> > >
> > > This issue was discussed at length on LKML previously, the discussion
> > > is linked below.
> > >
> > > Syzbot tested this patch on mainline and the bug did not replicate.
> > > This patch was regression tested by issuing various system calls on a
> > > mounted HFS filesystem and validating that file creation, deletion,
> > > reads and writes all work.
> > >
> > > Link: https://lore.kernel.org/all/427fcb57-8424-4e52-9f21-7041b2c4ae5b@
> > > I-love.SAKURA.ne.jp/T/
> > > Reported-by: syzbot+97e301b4b82ae803d21b@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > Closes: https://syzkaller.appspot.com/bug?extid=97e301b4b82ae803d21b
> > > Cc: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> > > Tested-by: syzbot+97e301b4b82ae803d21b@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > Signed-off-by: George Anthony Vernon <contact@xxxxxxxxxxx>
> > > ---
> > > fs/hfs/inode.c | 67 +++++++++++++++++++++++++++++++++++++++-----------
> > > 1 file changed, 53 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/fs/hfs/inode.c b/fs/hfs/inode.c
> > > index 9cd449913dc8..bc346693941d 100644
> > > --- a/fs/hfs/inode.c
> > > +++ b/fs/hfs/inode.c
> > > @@ -321,6 +321,38 @@ static int hfs_test_inode(struct inode *inode, void *data)
> > > }
> > > }
> > >
> > > +/*
> > > + * is_valid_cnid
> > > + *
> > > + * Validate the CNID of a catalog record
> > > + */
> > > +static inline
> > > +bool is_valid_cnid(u32 cnid, u8 type)
> > > +{
> > > + if (likely(cnid >= HFS_FIRSTUSER_CNID))
> > > + return true;
> > > +
> > > + switch (cnid) {
> > > + case HFS_ROOT_CNID:
> > > + return type == HFS_CDR_DIR;
> > > + case HFS_EXT_CNID:
> > > + case HFS_CAT_CNID:
> > > + return type == HFS_CDR_FIL;
> > > + case HFS_POR_CNID:
> > > + /* No valid record with this CNID */
> > > + break;
> > > + case HFS_BAD_CNID:
> >
> > HFS is ancient file system that was needed to work with floppy disks. And bad
> > sectors management was regular task and responsibility of HFS for the case of
> > floppy disks (HDD was also not very reliable at that times). So, HFS implements
> > the bad block management. It means that, potentially, Linux kernel could need to
> > mount a file system volume that created by ancient Mac OS.
> >
> > I don't think that it's correct management of HFS_BAD_CNID. We must to expect to
> > have such CNID for the case of HFS.
> >
>
> HFS_BAD_CNID is reserved for internal use of the filesystem
> implementation. Since we never intend to use it, there is no correct
> logical path that we should ever construct an inode with CNID 5. It does
> not correspond to a record that the user can open, as it is a special
> CNID used only for extent records used to mark blocks as allocated so
> they are not used, a behaviour which we do not implement in the Linux
> HFS or HFS+ drivers. Disallowing this CNID will not prevent correctly
> formed filesystems from being mounted. I also don't think that
> presenting an internal record-keeping structure to the VFS would make
> sense or would be consistent with other filesystems.
>

Yes, we don't want to use it on Linux kernel side. But, potentially, the file
for bad block management could or was been created/used on Mac OS side. And if
anyone tries to mount the HFS volume with the bad block file, then we will
refuse to mount it on the Linux kernel side. This is my main worry here. But it
is not very probable situation, by the way.

> > > + case HFS_EXCH_CNID:
> > > + /* Not implemented */
> > > + break;
> > > + default:
> > > + /* Invalid reserved CNID */
> > > + break;
> > > + }
> > > +
> > > + return false;
> > > +}
> > > +
> > > /*
> > > * hfs_read_inode
> > > */
> > > @@ -350,6 +382,8 @@ static int hfs_read_inode(struct inode *inode, void *data)
> > > rec = idata->rec;
> > > switch (rec->type) {
> > > case HFS_CDR_FIL:
> > > + if (!is_valid_cnid(rec->file.FlNum, HFS_CDR_FIL))
> > > + goto make_bad_inode;
> > > if (!HFS_IS_RSRC(inode)) {
> > > hfs_inode_read_fork(inode, rec->file.ExtRec, rec->file.LgLen,
> > > rec->file.PyLen, be16_to_cpu(rec->file.ClpSize));
> > > @@ -371,6 +405,8 @@ static int hfs_read_inode(struct inode *inode, void *data)
> > > inode->i_mapping->a_ops = &hfs_aops;
> > > break;
> > > case HFS_CDR_DIR:
> > > + if (!is_valid_cnid(rec->dir.DirID, HFS_CDR_DIR))
> > > + goto make_bad_inode;
> > > inode->i_ino = be32_to_cpu(rec->dir.DirID);
> > > inode->i_size = be16_to_cpu(rec->dir.Val) + 2;
> > > HFS_I(inode)->fs_blocks = 0;
> > > @@ -380,8 +416,12 @@ static int hfs_read_inode(struct inode *inode, void *data)
> > > inode->i_op = &hfs_dir_inode_operations;
> > > inode->i_fop = &hfs_dir_operations;
> > > break;
> > > + make_bad_inode:
> > > + pr_warn("rejected cnid %lu. Volume is probably corrupted, try performing fsck.\n", inode->i_ino);
> >
> > The "invalid cnid" could sound more relevant than "rejected cnid" for my taste.
> >
> > The whole message is too long. What's about to have two messages here?
> >
> > pr_warn("invalid cnid %lu\n", inode->i_ino);
> > pr_warn("Volume is probably corrupted, try performing fsck.\n");
> >
> Good improvement!
> >
> > > + fallthrough;
> > > default:
> > > make_bad_inode(inode);
> > > + break;
> > > }
> > > return 0;
> > > }
> > > @@ -441,20 +481,19 @@ int hfs_write_inode(struct inode *inode, struct writeback_control *wbc)
> > > if (res)
> > > return res;
> > >
> > > - if (inode->i_ino < HFS_FIRSTUSER_CNID) {
> > > - switch (inode->i_ino) {
> > > - case HFS_ROOT_CNID:
> > > - break;
> > > - case HFS_EXT_CNID:
> > > - hfs_btree_write(HFS_SB(inode->i_sb)->ext_tree);
> > > - return 0;
> > > - case HFS_CAT_CNID:
> > > - hfs_btree_write(HFS_SB(inode->i_sb)->cat_tree);
> > > - return 0;
> > > - default:
> > > - BUG();
> > > - return -EIO;
> > > - }
> > > + if (!is_valid_cnid(inode->i_ino,
> > > + S_ISDIR(inode->i_mode) ? HFS_CDR_DIR : HFS_CDR_FIL))
> >
> > What's about to introduce static inline function or local variable for
> > S_ISDIR(inode->i_mode) ? HFS_CDR_DIR : HFS_CDR_FIL? I don't like this two line
> > implementation.
>
> Okay, I will rewrite this.
>
> >
> > > + BUG();
> >
> > I am completely against of leaving BUG() here. Several fixes of syzbot issues
> > were the exchanging BUG() on returning error code. I don't want to investigate
> > the another syzbot issue that will involve this BUG() here. Let's return error
> > code here.
> >
> > Usually, it makes sense to have BUG() for debug mode and to return error code
> > for the case of release mode. But we don't have the debug mode for HFS code.
>
> I prefer BUG() because I think it is a serious bug that we should not
> allow for a bad inode to be written. I am willing to take responsibility
> for investigating further issues if they appear as a result of this. Of
> course, the final say on BUG() or -EIO is yours as the maintainer.
>
> >

The hfs_write_inode() will return error code and it is also bug case. But we
will not crash the kernel in such case. Why would you still like to crash the
kernel? :)

I see your point that we should not be here because we must create the bad inode
in hfs_read_inode() for the case of corrupted Catalog File's records. Let me
sleep on it. :)

Thanks,
Slava.