Re: linux-next: build failure after merge of the tip tree

From: Ojaswin Mujoo

Date: Wed Mar 11 2026 - 10:39:32 EST


On Wed, Mar 11, 2026 at 11:37:43AM +0100, Peter Zijlstra wrote:
> On Wed, Mar 11, 2026 at 12:00:20AM +0000, Mark Brown wrote:
> > On Tue, Mar 10, 2026 at 06:28:30PM +0000, Mark Brown wrote:
> > > Hi all,
> > >
> > > After merging the tip tree, today's linux-next started crashing running
> > > arm64 KUnit like this:
> > >
> > > [18:12:16] [PASSED] split unwrit extent to 3 extents and convert 2nd half writ (non-endio, zeroout) (highlevel)
> > > [18:12:16] =============== [PASSED] test_split_convert ================
> > > [18:12:16] ================ [PASSED] ext4_extents_test ================
> > > [18:12:16] ============== ext4_mballoc_test (7 subtests) ==============
> > > Command '['qemu-system-aarch64', '-nodefaults', '-m', '1024', '-kernel', '/tmp/next/arm64_kunit/arch/arm64/boot/Image.gz', '-append', 'kunit.enable=1 console=ttyAMA0 kunit_shutdown=reboot', '-no-reboot', '-nographic', '-accel', 'kvm', '-accel', 'hvf', '-accel', 'tcg', '-serial', 'stdio', '-machine', 'virt', '-cpu', 'max']' timed out after 300 seconds
> > >
> > > I didn't figure out what the source of the issue was, I merged the tip
> > > tree from 20260309 instead.
> >
> > I tried to leave a bisect running but it got confused because a lot of
> > the branches are based on v7.0-rc1 which has a separate bug that causes
> > KUnit to lock up so the results are nonsense. I did confirm an issue
> > with just tip/master. My KUnit command line running on current Debian
> > stable is:
> >
> > ./tools/testing/kunit/kunit.py run --alltests --arch arm64 --cross_compile=aarch64-linux-gnu-
> >
> > and I also tried:
> >
> > ./tools/testing/kunit/kunit.py run --alltests --arch x86_64 --cross_compile=x86_64-linux-gnu-
> >
> > and got:
> >
> > [23:51:03] [PASSED] split unwrit extent to 3 extents and convert 2nd half writ (non-endio, zeroout) (highlevel)
> > [23:51:03] =============== [PASSED] test_split_convert ================
> > [23:51:03] ================ [PASSED] ext4_extents_test ================
> > [23:51:03] ============== ext4_mballoc_test (7 subtests) ==============
> > [23:51:03] ================= test_new_blocks_simple ==================
> > [23:51:03] [FAILED] block_bits=10 cluster_bits=3 blocks_per_group=8192 group_count=4 desc_size=64
>
>
> Right, so I bisected this using:
>
> ./tools/testing/kunit/kunit.py run --alltests --build_dir=$PWD/kunit-build/ --arch=x86_64 ext4_*
>
> and hit:
>
> 25500ba7e77c ("locking/mutex: Remove the list_head from struct mutex")
>
> After much staring, I couldn't find anything wrong with it, and decided
> to add a few DEBUG options on. And it magically started working.
>
> Then I did a KASAN run of the above, and that got me the below.
> There seems to have been some recent commits in this area, Cc'ed
> relevant people.
>
> [11:17:27] ==================================================================
> [11:17:27] BUG: KASAN: slab-use-after-free in __percpu_counter_init_many+0x21b/0x2f0
> [11:17:27] Write of size 8 at addr ffff8880029425a8 by task kunit_try_catch/37

Hi Peter,

I believe this issue is resolved with [1]. However, I think this might
not be the same as the issue reported originally, as the original
issue seems to be in mballoc-tests rather than extent-tests. I'll try to
look into it a bit more.

[1] https://lore.kernel.org/linux-ext4/5bb9041471dab8ce870c191c19cbe4df57473be8.1772381213.git.ritesh.list@xxxxxxxxx/

Regards,
ojaswin

> [11:17:27]
> [11:17:27] CPU: 0 UID: 0 PID: 37 Comm: kunit_try_catch Tainted: G N 7.0.0-rc1-00023-g25500ba7e77c #3 PREEMPT(lazy)
> [11:17:27] Tainted: [N]=TEST
> [11:17:27] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.1 11/11/2019
> [11:17:27] Call Trace:
> [11:17:27] <TASK>
> [11:17:27] dump_stack_lvl+0x4e/0x70
> [11:17:27] print_report+0x152/0x4b0
> [11:17:27] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
> [11:17:27] ? __pfx_mutex_unlock+0x10/0x10
> [11:17:27] ? __percpu_counter_init_many+0x21b/0x2f0
> [11:17:27] kasan_report+0xe0/0x110
> [11:17:27] ? __percpu_counter_init_many+0x21b/0x2f0
> [11:17:27] __percpu_counter_init_many+0x21b/0x2f0
> [11:17:27] ext4_es_register_shrinker+0x115/0x3e0
> [11:17:27] ? kasan_save_track+0x14/0x30
> [11:17:27] extents_kunit_init+0x1d1/0x890
> [11:17:27] kunit_try_run_case+0x170/0x2d0
> [11:17:27] ? __pfx_kunit_try_run_case+0x10/0x10
> [11:17:27] ? kthread_affine_node+0x1b3/0x250
> [11:17:27] ? __pfx_kthread_affine_node+0x10/0x10
> [11:17:27] ? __pfx_kunit_try_run_case+0x10/0x10
> [11:17:27] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
> [11:17:27] kunit_generic_run_threadfn_adapter+0x7b/0xe0
> [11:17:27] kthread+0x2dc/0x3c0
> [11:17:27] ? recalc_sigpending+0x15d/0x1e0
> [11:17:27] ? __pfx_kthread+0x10/0x10
> [11:17:27] ret_from_fork+0x445/0x610
> [11:17:27] ? __pfx_ret_from_fork+0x10/0x10
> [11:17:27] ? __switch_to+0x31/0xd60
> [11:17:27] ? __switch_to_asm+0x39/0x70
> [11:17:27] ? __switch_to_asm+0x33/0x70
> [11:17:27] ? __pfx_kthread+0x10/0x10
> [11:17:27] ret_from_fork_asm+0x1a/0x30
> [11:17:27] </TASK>
> [11:17:27]
> [11:17:27] Allocated by task 35:
> [11:17:27] kasan_save_stack+0x30/0x50
> [11:17:27] kasan_save_track+0x14/0x30
> [11:17:27] __kasan_kmalloc+0x8f/0xa0
> [11:17:27] extents_kunit_init+0xf0/0x890
> [11:17:27] kunit_try_run_case+0x170/0x2d0
> [11:17:27] kunit_generic_run_threadfn_adapter+0x7b/0xe0
> [11:17:27] kthread+0x2dc/0x3c0
> [11:17:27] ret_from_fork+0x445/0x610
> [11:17:27] ret_from_fork_asm+0x1a/0x30
> [11:17:27]
> [11:17:27] Freed by task 36:
> [11:17:27] kasan_save_stack+0x30/0x50
> [11:17:27] kasan_save_track+0x14/0x30
> [11:17:27] kasan_save_free_info+0x3b/0x60
> [11:17:27] __kasan_slab_free+0x43/0x70
> [11:17:27] kfree+0x130/0x330
> [11:17:27] extents_kunit_exit+0x5b/0x90
> [11:17:27] kunit_try_run_case_cleanup+0xad/0xe0
> [11:17:27] kunit_generic_run_threadfn_adapter+0x7b/0xe0
> [11:17:27] kthread+0x2dc/0x3c0
> [11:17:27] ret_from_fork+0x445/0x610
> [11:17:27] ret_from_fork_asm+0x1a/0x30
> [11:17:27]
> [11:17:27] The buggy address belongs to the object at ffff888002942000
> [11:17:27] which belongs to the cache kmalloc-4k of size 4096
> [11:17:27] The buggy address is located 1448 bytes inside of
> [11:17:27] freed 4096-byte region [ffff888002942000, ffff888002943000)
> [11:17:27]
> [11:17:27] The buggy address belongs to the physical page:
> [11:17:27] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2940
> [11:17:27] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
> [11:17:27] flags: 0x4000000000000040(head|zone=1)
> [11:17:27] page_type: f5(slab)
> [11:17:27] raw: 4000000000000040 ffff888001041d00 dead000000000100 dead000000000122
> [11:17:27] raw: 0000000000000000 0000000000040004 00000000f5000000 0000000000000000
> [11:17:27] head: 4000000000000040 ffff888001041d00 dead000000000100 dead000000000122
> [11:17:27] head: 0000000000000000 0000000000040004 00000000f5000000 0000000000000000
> [11:17:27] head: 4000000000000003 ffffea00000a5001 00000000ffffffff 00000000ffffffff
> [11:17:27] head: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> [11:17:27] page dumped because: kasan: bad access detected
> [11:17:27]
> [11:17:27] Memory state around the buggy address:
> [11:17:27] ffff888002942480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [11:17:27] ffff888002942500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [11:17:27] >ffff888002942580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [11:17:27] ^
> [11:17:27] ffff888002942600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [11:17:27] ffff888002942680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [11:17:27] ==================================================================
> [11:17:27] Disabling lock debugging due to kernel taint
> [11:17:27] # [extent 0] exp: lblk:10 len:1 unwrit:1
> [11:17:27] # [extent 0] got: lblk:10 len:1 unwrit:1
> [11:17:27] ------------------
> [11:17:27] # [extent 1] exp: lblk:11 len:2 unwrit:0
> [11:17:27] # [extent 1] got: lblk:11 len:2 unwrit:0
> [11:17:27] ------------------
> [11:17:27] [FAILED] split unwrit extent to 2 extents and convert 2nd half writ
>