Re: [PATCH] ext4: fix out-of-bounds issue in ext4_xattr_set_entry

From: Baokun Li
Date: Thu Oct 10 2024 - 22:19:19 EST

Next message: Andrii Nakryiko: "Re: [PATCHv6 bpf-next 03/16] bpf: Allow return values 0 and 1 for kprobe session"
Previous message: Howard Chu: "Re: [PATCH 0/2] perf trace: Fix support for the new BPF feature in clang 12"
In reply to: Jan Kara: "Re: [PATCH] ext4: fix out-of-bounds issue in ext4_xattr_set_entry"
Next in thread: Jan Kara: "Re: [PATCH] ext4: fix out-of-bounds issue in ext4_xattr_set_entry"

On 2024/10/9 23:50, Jan Kara wrote:

On Tue 08-10-24 15:40:39, Baokun Li wrote:

On 2024/9/22 14:42, Qianqiang Liu wrote:

syzbot has found an out-of-bounds issue in ext4_xattr_set_entry:

==================================================================
BUG: KASAN: out-of-bounds in ext4_xattr_set_entry+0x8ce/0x1f60 fs/ext4/xattr.c:1781
Read of size 18446744073709551572 at addr ffff888036426850 by task syz-executor264/5095

CPU: 0 UID: 0 PID: 5095 Comm: syz-executor264 Not tainted 6.11.0-syzkaller-03917-ga940d9a43e62 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:93 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
__asan_memmove+0x29/0x70 mm/kasan/shadow.c:94
ext4_xattr_set_entry+0x8ce/0x1f60 fs/ext4/xattr.c:1781
[...]
==================================================================

This issue is caused by a negative size in memmove.
We need to check for this.

Fixes: dec214d00e0d ("ext4: xattr inode deduplication")
Reported-by: syzbot+f792df426ff0f5ceb8d1@xxxxxxxxxxxxxxxxxxxxxxxxx
Closes: https://syzkaller.appspot.com/bug?extid=f792df426ff0f5ceb8d1
Tested-by: syzbot+f792df426ff0f5ceb8d1@xxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Qianqiang Liu <qianqiang.liu@xxxxxxx>
---
fs/ext4/xattr.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 46ce2f21fef9..336badb46246 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1776,7 +1776,14 @@ static int ext4_xattr_set_entry(struct ext4_xattr_info *i,
        } else if (s->not_found) {
                /* Insert new name. */
                size_t size = EXT4_XATTR_LEN(name_len);
-               size_t rest = (void *)last - (void *)here + sizeof(__u32);
+               size_t rest;
+
+               if (last < here) {
+                       ret = -ENOSPC;
+                       goto out;
+               } else {
+                       rest = (void *)last - (void *)here + sizeof(__u32);
+               }
                memmove((void *)here + size, here, rest);
                memset(here, 0, size);

This change just passes syzbot's test cases without fixing the real
problem.

The root cause of the problem is that the inode's xattr block is marked as
free in the block bitmap, so that block gets allocated to the ea inode. The
data in the xattr block is then overwritten, so the second lookup computes a
different last, which results in the out-of-bounds access.

The stack that triggers the problem is as follows:

// An inode whose xattr block is block 33.
__ext4_mark_inode_dirty
__ext4_expand_extra_isize
ext4_expand_extra_isize_ea
   ext4_xattr_make_inode_space
    // Move xattr from inode to block
    ext4_xattr_move_to_block
     // Find out if the xattr exists in the block
     ext4_xattr_block_find
      // If xattr does not exist, here == last
      xattr_find_entry
     // Add a new xattr to the block
     ext4_xattr_block_set
      |// xattr is too long, needs an ea inode
      |ext4_xattr_inode_lookup_create
      | ext4_xattr_inode_create
      | ext4_xattr_inode_write
      | ext4_map_blocks
      |   // xattr block 33 is assigned to the new ea inode
      | memcpy(bh->b_data, buf, csize)
      |   // The value of xattr overwrites the data in the xattr block.
      |ext4_xattr_set_entry
       // Since the contents of the xattr block have changed,
       // now here == last does not hold, so it is possible to
       // have last < here and trigger an out-of-bounds access.
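
To make the last step of the stack concrete, here is a minimal userspace
sketch (not kernel code) of the rest computation in ext4_xattr_set_entry().
The helper names xattr_rest_len() and xattr_entries_sane() are assumptions
made for this example, and so is the 48-byte distance used below; it was
picked only because it reproduces the size in the KASAN report on a 64-bit
build.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the kernel expression
 *   size_t rest = (void *)last - (void *)here + sizeof(__u32);
 * The pointer difference is signed, but the result is stored in a size_t,
 * so a negative difference wraps around to a huge unsigned length.
 */
static size_t xattr_rest_len(const char *here, const char *last)
{
	return (size_t)(last - here) + sizeof(uint32_t);
}

/* A sane entry list always has last >= here. */
static int xattr_entries_sane(const char *here, const char *last)
{
	return last >= here;
}
```

If here ends up 48 bytes past last, xattr_rest_len() returns 2^64 - 44 on a
64-bit build, which is the 18446744073709551572 that memmove() was asked to
read in the report above.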

So I think we should probably add a helper function ext4_mb_block_inuse()
that check_xattrs() can use to verify, against the block bitmap, that the
xattr block is not marked as free.
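
A minimal userspace sketch of what such a check boils down to, assuming one
bit per block where a set bit means "in use". The name block_inuse() and the
flat in-memory bitmap are assumptions for illustration; a real helper would
have to read the block group's bitmap buffer, which is not shown in this
thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical: test bit `blk` in a little-endian block bitmap.
 * A set bit means the block is allocated to someone.
 */
static bool block_inuse(const uint8_t *bitmap, size_t nbits, uint64_t blk)
{
	if (blk >= nbits)
		return false;	/* out of range: treat as not in use */
	return (bitmap[blk / 8] >> (blk % 8)) & 1;
}
```

Note that a set bit only says the block is allocated, not which inode it is
allocated to.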

Hi Honza,

Thanks so much for your thoughts and feedback!

Well, even that would be a relatively narrow fix. You could have e.g. a
file reference the xattr block as one of its data blocks and then corrupt
the xattr contents at an unfortunate moment. That will not get fixed by
checking whether the block is allocated. These multiply claimed blocks (as
e2fsck calls them) are very hard to detect inside the kernel.

Yes, after locating the issue, my first thought was to just take the buffer
lock and check the xattr magic and the xattr block checksum. However, if the
block is allocated as an xattr block to another file, the issue may still
occur.

Therefore we have to make sure that the block has been allocated to the
current file. With the block bitmap we can verify that the current block
is allocated, but as you pointed out we cannot verify that it is only
allocated to the current file.

That means we need some way to find the owner of a block starting from the
block itself, which is what led me to XFS reverse mapping.

Or go one step further and add a mechanism like xfs Reverse-Mapping, which
makes sure that allocated blocks do point to the target inode, which could
replace the current block_validity, and could also be used in future online
fscks.

Well, that is a rather big change. It requires a significant on-disk format
change and also has a performance cost to maintain. Furthermore, for xattr
blocks, which can be shared by many inodes, it is not even clear how to
implement this... So I'm not sure we really want to do this either.

Honza

Yes, there can be a lot of work involved.

* Perhaps we could create an rmap file to store the rmap tree to avoid
   on-disk format changes.
* The performance impact of maintaining rmap really needs to be evaluated,
   perhaps by writing a simple demo to test it.
* XFS supports shared blocks (a.k.a. reflink), so when the physical blocks
   are the same but the inodes or the logical blocks differ, they are
   recorded multiple times in the tree. Shared xattr blocks could be
   handled similarly.
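
As a rough illustration of the last point, here is a userspace sketch of
reverse-mapping records in which a shared physical block appears once per
owner. The struct layout and the names rmap_rec, rmap_count() and
rmap_sole_owner() are assumptions made for this example; they are not taken
from XFS or ext4.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one entry per (physical block, owner, logical block). */
struct rmap_rec {
	uint64_t pblk;	/* physical block number */
	uint64_t owner;	/* owning inode number */
	uint64_t lblk;	/* logical block within the owner */
};

/* Count the records that reference physical block `pblk`. */
static size_t rmap_count(const struct rmap_rec *recs, size_t n, uint64_t pblk)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++)
		if (recs[i].pblk == pblk)
			hits++;
	return hits;
}

/* True only if `pblk` is referenced and every reference belongs to `ino`. */
static bool rmap_sole_owner(const struct rmap_rec *recs, size_t n,
			    uint64_t pblk, uint64_t ino)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++) {
		if (recs[i].pblk != pblk)
			continue;
		if (recs[i].owner != ino)
			return false;
		hits++;
	}
	return hits > 0;
}
```

With records such as {33, 12, 0} and {33, 15, 7}, block 33 shows up twice, so
a strict "only this inode owns the block" check fails for inode 12, while a
legitimately shared xattr block would need a policy that accepts several
owners.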

We have plans to support online fsck in the future, and implementing rmap
is one of the steps. Perhaps we can wait until rmap is implemented and then
assess whether a strict check is worthwhile here.

Implementing rmap may take some time. Until then, we can avoid the problem
as much as possible by checking the magic and the xattr block csum.
Maybe something like this?

diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 7647e9f6e190..cd3ae1e3371c 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1676,6 +1676,13 @@ static int ext4_xattr_set_entry(struct ext4_xattr_info *i,
                }
        }

+       if (WARN_ON_ONCE(last < here)) {
+               EXT4_ERROR_INODE(inode, "corrupted xattr entries in %s",
+                                       is_block ? "block" : "ibody");
+               ret = -EFSCORRUPTED;
+               goto out;
+       }
+
        /* Check whether we have enough space. */
        if (i->value) {
                size_t free;
@@ -1923,6 +1930,7 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
        }

        if (s->base) {
+               struct ext4_xattr_header *hdr;
                int offset = (char *)s->here - bs->bh->b_data;

                BUFFER_TRACE(bs->bh, "get_write_access");
@@ -1932,6 +1940,16 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
                        goto cleanup;

                lock_buffer(bs->bh);
+               hdr = header(s->base);
+
+               if (hdr->h_magic != cpu_to_le32(EXT4_XATTR_MAGIC) ||
+                   (ext4_has_metadata_csum(inode->i_sb) &&
+                    (ext4_xattr_block_csum(inode, bs->bh->b_blocknr, hdr) !=
+                     hdr->h_checksum))) {
+                       unlock_buffer(bs->bh);
+                       error = -EFSCORRUPTED;
+                       goto bad_block;
+               }

                if (header(s->base)->h_refcount == cpu_to_le32(1)) {
                        __u32 hash = le32_to_cpu(BHDR(bs->bh)->h_hash);
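
The idea behind the second hunk, re-validating the block under the buffer
lock before modifying it, can be sketched in userspace as follows. This is
only an illustration: struct demo_xattr_hdr is a simplified host-endian
header, and demo_csum() is a toy checksum standing in for
ext4_xattr_block_csum(); it is not crc32c. Only the magic value matches
EXT4_XATTR_MAGIC.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_XATTR_MAGIC 0xEA020000u	/* same value as EXT4_XATTR_MAGIC */

/* Hypothetical, simplified header; fields are host-endian here. */
struct demo_xattr_hdr {
	uint32_t magic;
	uint32_t csum;
};

/* Toy checksum used only for this sketch; NOT the ext4 checksum. */
static uint32_t demo_csum(const uint8_t *payload, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < len; i++)
		sum = sum * 31 + payload[i];
	return sum;
}

/* Reject the block if the magic or the stored checksum no longer matches. */
static bool demo_block_valid(const struct demo_xattr_hdr *hdr,
			     const uint8_t *payload, size_t len)
{
	if (hdr->magic != DEMO_XATTR_MAGIC)
		return false;
	return hdr->csum == demo_csum(payload, len);
}
```

A block whose contents were overwritten by an ea inode write would fail one
of these two checks unless the new data happened to carry a valid header and
checksum.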

--
Thanks,
Baokun
