Re: [PATCH] xfs: reject CRC validation when the log header cannot be retrieved

From: Brian Foster

Date: Mon Apr 06 2026 - 10:12:34 EST

On Fri, Apr 03, 2026 at 09:43:39AM +0800, Edward Adam Davis wrote:
> When the traditional algorithm fails to locate the log header, it triggers
> the uninitialized-value issue regarding tmp_rhead_blk reported in [1].
> Continuing with the subsequent CRC verification traversal in such a
> scenario is futile.
>
> A check has been added to detect the absence of the log header and prevent
> the execution of the subsequent CRC verification traversal.
>
> [1]
> BUG: KMSAN: uninit-value in xlog_verify_head+0x6c3/0x910 fs/xfs/xfs_log_recover.c:1058
> xlog_verify_head+0x6c3/0x910 fs/xfs/xfs_log_recover.c:1058
> xlog_find_tail+0xc2e/0x1a50 fs/xfs/xfs_log_recover.c:1315
> xlog_recover+0x6d/0x800 fs/xfs/xfs_log_recover.c:3426
> xfs_log_mount+0x4da/0x880 fs/xfs/xfs_log.c:617
>
> Local variable tmp_rhead_blk created at:
> xlog_verify_head+0x81/0x910 fs/xfs/xfs_log_recover.c:1032
>
> Reported-by: syzbot+b7dfbed0c6c2b5e9fd34@xxxxxxxxxxxxxxxxxxxxxxxxx
> Closes: https://syzkaller.appspot.com/bug?extid=b7dfbed0c6c2b5e9fd34
> Signed-off-by: Edward Adam Davis <eadavis@xxxxxx>
> ---
> fs/xfs/xfs_log_recover.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 09e6678ca487..0d1b4bddd193 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -1050,6 +1050,9 @@ xlog_verify_head(
> if (error < 0)
> return error;
>
> + if (!error)
> + return -EIO;
> +

Hmm... at this point we've located the head block, pulled the tail block
from the log record header, and are attempting to find the last written
log record that could potentially have been torn, based on max iclogs, so
we can verify it with a CRC pass.

Have you dug into how syzbot triggers this issue? The tweak seems
reasonable at a glance as I'm not sure why we wouldn't find at least one
log record header in the head/tail range, but at minimum the patch
should provide some analysis on why we should make that assumption here
and how this happens. It would also be ideal to know what's going on to
help determine whether there isn't some other issue here that might need
to be addressed.

For example, are we returning 0 here for the head verification pass and,
aside from the uninit variable issue, falling into an otherwise
functional log recovery? Or does log recovery ultimately fail further
along? I'd be hesitant to blindly add an error return into a functioning
recovery situation, as that might imply there's something wrong with the
verification logic. It may be a different story if there's some
corruption or similar that we're not handling gracefully enough.
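
To make the question concrete, here is a rough userspace sketch of the
return convention implied by the hunk above: negative means error, 0 means
no record header was found, and positive means one was found and the out
parameter was set. This is not the kernel code. seek_record_header(),
verify_head(), and the int-array "log" are hypothetical stand-ins, and the
-EIO on a zero return simply mirrors the proposed change.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a backwards header search.
 * Returns 1 and sets *found_blk if a block equal to 'magic' exists,
 * 0 if none exists (*found_blk is left untouched), <0 on error.
 */
static int seek_record_header(const int *log, size_t nblocks, int magic,
                              size_t *found_blk)
{
    assert(log != NULL && found_blk != NULL);

    for (size_t i = nblocks; i > 0; i--) {
        if (log[i - 1] == magic) {
            *found_blk = i - 1;
            return 1;
        }
    }
    return 0;
}

/*
 * Hypothetical caller. tmp_blk is deliberately left uninitialized to
 * model the reported variable; the ret == 0 check is what keeps it
 * from being consumed when the search finds nothing.
 */
static int verify_head(const int *log, size_t nblocks, int magic,
                       size_t *start_blk)
{
    size_t tmp_blk;
    int ret;

    ret = seek_record_header(log, nblocks, magic, &tmp_blk);
    if (ret < 0)
        return ret;
    if (ret == 0)
        return -EIO;    /* assumption: treat "not found" as an I/O error */

    *start_blk = tmp_blk;   /* only reached when tmp_blk was written */
    return 0;
}
```

Without the ret == 0 check, the caller would read tmp_blk even though the
search never wrote it, which is the shape of the KMSAN report. The sketch
says nothing about whether a zero return is reachable on a healthy log;
that is the part that still needs analysis.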

FWIW, I did some LLM prodding at if/how something like this might happen
and it threw out some ideas based on records wrapping the log, but TBH
given the difficulty it has processing the details and layers of
complexity here I'm not really sure I trust it. The best bet is probably
to dig more into what the log looks like and why it triggers the issue.

Brian

> /*
> * Now run a CRC verification pass over the records starting at the
> * block found above to the current head. If a CRC failure occurs, the
> --
> 2.43.0
>
>