Re: linux-next: build failure after merge of the tip tree

From: Peter Zijlstra

Date: Wed Mar 11 2026 - 06:46:51 EST


On Wed, Mar 11, 2026 at 12:00:20AM +0000, Mark Brown wrote:
> On Tue, Mar 10, 2026 at 06:28:30PM +0000, Mark Brown wrote:
> > Hi all,
> >
> > After merging the tip tree, today's linux-next started crashing running
> > arm64 KUnit like this:
> >
> > [18:12:16] [PASSED] split unwrit extent to 3 extents and convert 2nd half writ (non-endio, zeroout) (highlevel)
> > [18:12:16] =============== [PASSED] test_split_convert ================
> > [18:12:16] ================ [PASSED] ext4_extents_test ================
> > [18:12:16] ============== ext4_mballoc_test (7 subtests) ==============
> > Command '['qemu-system-aarch64', '-nodefaults', '-m', '1024', '-kernel', '/tmp/next/arm64_kunit/arch/arm64/boot/Image.gz', '-append', 'kunit.enable=1 console=ttyAMA0 kunit_shutdown=reboot', '-no-reboot', '-nographic', '-accel', 'kvm', '-accel', 'hvf', '-accel', 'tcg', '-serial', 'stdio', '-machine', 'virt', '-cpu', 'max']' timed out after 300 seconds
> >
> > I didn't figure out what the source of the issue was, I merged the tip
> > tree from 20260309 instead.
>
> I tried to leave a bisect running but it got confused because a lot of
> the branches are based on v7.0-rc1 which has a separate bug that causes
> KUnit to lock up so the results are nonsense. I did confirm an issue
> with just tip/master. My KUnit command line running on current Debian
> stable is:
>
> ./tools/testing/kunit/kunit.py run --alltests --arch arm64 --cross_compile=aarch64-linux-gnu-
>
> and I also tried:
>
> ./tools/testing/kunit/kunit.py run --alltests --arch x86_64 --cross_compile=x86_64-linux-gnu-
>
> and got:
>
> [23:51:03] [PASSED] split unwrit extent to 3 extents and convert 2nd half writ (non-endio, zeroout) (highlevel)
> [23:51:03] =============== [PASSED] test_split_convert ================
> [23:51:03] ================ [PASSED] ext4_extents_test ================
> [23:51:03] ============== ext4_mballoc_test (7 subtests) ==============
> [23:51:03] ================= test_new_blocks_simple ==================
> [23:51:03] [FAILED] block_bits=10 cluster_bits=3 blocks_per_group=8192 group_count=4 desc_size=64


Right, so I bisected this using:

./tools/testing/kunit/kunit.py run --alltests --build_dir=$PWD/kunit-build/ --arch=x86_64 ext4_*

and hit:

25500ba7e77c ("locking/mutex: Remove the list_head from struct mutex")
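
For anyone repeating the bisect, the kunit.py invocation above could be wrapped for `git bisect run` roughly as follows. This is only a sketch, not the script actually used here: the names `KUNIT_CMD`, `bisect_status` and `main`, the 600 second timeout, the relative build directory, and the choice to skip (rather than mark bad) a commit whose run times out are all assumptions. The exit codes 0 (good), 1 (bad) and 125 (skip) are the ones `git bisect run` documents.

```python
import subprocess

# Same invocation as above, with a relative build directory (assumption).
KUNIT_CMD = [
    "./tools/testing/kunit/kunit.py", "run", "--alltests",
    "--build_dir=kunit-build/", "--arch=x86_64", "ext4_*",
]


def bisect_status(returncode, timed_out=False):
    """Map the outcome of one kunit.py run onto a git-bisect-run exit code."""
    if timed_out:
        # Assumption: a hang may come from an unrelated bug, so skip the commit.
        return 125
    # Any non-zero exit from kunit.py is treated as "bad".
    return 0 if returncode == 0 else 1


def main():
    """Run the tests once and return the exit code for git bisect."""
    try:
        proc = subprocess.run(KUNIT_CMD, timeout=600)  # timeout is an assumption
    except subprocess.TimeoutExpired:
        return bisect_status(None, timed_out=True)
    return bisect_status(proc.returncode)
```

To use it as a script, end the file with `import sys; sys.exit(main())` and pass it to `git bisect run` from the top of the kernel tree. Note that a commit which fails to build would also be reported as bad by this mapping.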

After much staring I couldn't find anything wrong with it, so I decided
to turn on a few DEBUG options, at which point it magically started
working.

Then I did a KASAN run of the above, which produced the report below.
There seem to have been some recent commits in this area, so I have
Cc'ed the relevant people.

[11:17:27] ==================================================================
[11:17:27] BUG: KASAN: slab-use-after-free in __percpu_counter_init_many+0x21b/0x2f0
[11:17:27] Write of size 8 at addr ffff8880029425a8 by task kunit_try_catch/37
[11:17:27]
[11:17:27] CPU: 0 UID: 0 PID: 37 Comm: kunit_try_catch Tainted: G N 7.0.0-rc1-00023-g25500ba7e77c #3 PREEMPT(lazy)
[11:17:27] Tainted: [N]=TEST
[11:17:27] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.1 11/11/2019
[11:17:27] Call Trace:
[11:17:27] <TASK>
[11:17:27] dump_stack_lvl+0x4e/0x70
[11:17:27] print_report+0x152/0x4b0
[11:17:27] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[11:17:27] ? __pfx_mutex_unlock+0x10/0x10
[11:17:27] ? __percpu_counter_init_many+0x21b/0x2f0
[11:17:27] kasan_report+0xe0/0x110
[11:17:27] ? __percpu_counter_init_many+0x21b/0x2f0
[11:17:27] __percpu_counter_init_many+0x21b/0x2f0
[11:17:27] ext4_es_register_shrinker+0x115/0x3e0
[11:17:27] ? kasan_save_track+0x14/0x30
[11:17:27] extents_kunit_init+0x1d1/0x890
[11:17:27] kunit_try_run_case+0x170/0x2d0
[11:17:27] ? __pfx_kunit_try_run_case+0x10/0x10
[11:17:27] ? kthread_affine_node+0x1b3/0x250
[11:17:27] ? __pfx_kthread_affine_node+0x10/0x10
[11:17:27] ? __pfx_kunit_try_run_case+0x10/0x10
[11:17:27] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
[11:17:27] kunit_generic_run_threadfn_adapter+0x7b/0xe0
[11:17:27] kthread+0x2dc/0x3c0
[11:17:27] ? recalc_sigpending+0x15d/0x1e0
[11:17:27] ? __pfx_kthread+0x10/0x10
[11:17:27] ret_from_fork+0x445/0x610
[11:17:27] ? __pfx_ret_from_fork+0x10/0x10
[11:17:27] ? __switch_to+0x31/0xd60
[11:17:27] ? __switch_to_asm+0x39/0x70
[11:17:27] ? __switch_to_asm+0x33/0x70
[11:17:27] ? __pfx_kthread+0x10/0x10
[11:17:27] ret_from_fork_asm+0x1a/0x30
[11:17:27] </TASK>
[11:17:27]
[11:17:27] Allocated by task 35:
[11:17:27] kasan_save_stack+0x30/0x50
[11:17:27] kasan_save_track+0x14/0x30
[11:17:27] __kasan_kmalloc+0x8f/0xa0
[11:17:27] extents_kunit_init+0xf0/0x890
[11:17:27] kunit_try_run_case+0x170/0x2d0
[11:17:27] kunit_generic_run_threadfn_adapter+0x7b/0xe0
[11:17:27] kthread+0x2dc/0x3c0
[11:17:27] ret_from_fork+0x445/0x610
[11:17:27] ret_from_fork_asm+0x1a/0x30
[11:17:27]
[11:17:27] Freed by task 36:
[11:17:27] kasan_save_stack+0x30/0x50
[11:17:27] kasan_save_track+0x14/0x30
[11:17:27] kasan_save_free_info+0x3b/0x60
[11:17:27] __kasan_slab_free+0x43/0x70
[11:17:27] kfree+0x130/0x330
[11:17:27] extents_kunit_exit+0x5b/0x90
[11:17:27] kunit_try_run_case_cleanup+0xad/0xe0
[11:17:27] kunit_generic_run_threadfn_adapter+0x7b/0xe0
[11:17:27] kthread+0x2dc/0x3c0
[11:17:27] ret_from_fork+0x445/0x610
[11:17:27] ret_from_fork_asm+0x1a/0x30
[11:17:27]
[11:17:27] The buggy address belongs to the object at ffff888002942000
[11:17:27] which belongs to the cache kmalloc-4k of size 4096
[11:17:27] The buggy address is located 1448 bytes inside of
[11:17:27] freed 4096-byte region [ffff888002942000, ffff888002943000)
[11:17:27]
[11:17:27] The buggy address belongs to the physical page:
[11:17:27] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2940
[11:17:27] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[11:17:27] flags: 0x4000000000000040(head|zone=1)
[11:17:27] page_type: f5(slab)
[11:17:27] raw: 4000000000000040 ffff888001041d00 dead000000000100 dead000000000122
[11:17:27] raw: 0000000000000000 0000000000040004 00000000f5000000 0000000000000000
[11:17:27] head: 4000000000000040 ffff888001041d00 dead000000000100 dead000000000122
[11:17:27] head: 0000000000000000 0000000000040004 00000000f5000000 0000000000000000
[11:17:27] head: 4000000000000003 ffffea00000a5001 00000000ffffffff 00000000ffffffff
[11:17:27] head: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[11:17:27] page dumped because: kasan: bad access detected
[11:17:27]
[11:17:27] Memory state around the buggy address:
[11:17:27] ffff888002942480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[11:17:27] ffff888002942500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[11:17:27] >ffff888002942580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[11:17:27] ^
[11:17:27] ffff888002942600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[11:17:27] ffff888002942680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[11:17:27] ==================================================================
[11:17:27] Disabling lock debugging due to kernel taint
[11:17:27] # [extent 0] exp: lblk:10 len:1 unwrit:1
[11:17:27] # [extent 0] got: lblk:10 len:1 unwrit:1
[11:17:27] ------------------
[11:17:27] # [extent 1] exp: lblk:11 len:2 unwrit:0
[11:17:27] # [extent 1] got: lblk:11 len:2 unwrit:0
[11:17:27] ------------------
[11:17:27] [FAILED] split unwrit extent to 2 extents and convert 2nd half writ
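
The numbers in the report can be cross-checked from the addresses it prints. The object was allocated by task 35 in extents_kunit_init(), freed by task 36 in extents_kunit_exit(), and then written to by task 37 from a later extents_kunit_init(). The following is just an arithmetic sanity check on those addresses, with variable names chosen for illustration:

```python
# Values copied from the KASAN report above.
access_addr = 0xffff8880029425a8   # "Write of size 8 at addr ..."
object_start = 0xffff888002942000  # start of the freed kmalloc-4k object
object_end = 0xffff888002943000    # end of the freed region (exclusive)

# Offset of the bad write inside the freed object.
offset = access_addr - object_start
# Size of the freed region.
object_size = object_end - object_start
# Each "Memory state" row covers 128 bytes (16 shadow bytes x 8 bytes each),
# so rounding down to 128 gives the row marked with '>' in the report.
shadow_row = access_addr & ~0x7f

print(offset)           # 1448
print(object_size)      # 4096
print(hex(shadow_row))  # 0xffff888002942580
```

The results match the "1448 bytes inside of freed 4096-byte region" line and the marked shadow row in the report.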
