Re: [linus:master] [kasan] 3738290bfc: kunit.kasan.fail
From: Andrey Konovalov
Date: Wed Jan 08 2025 - 11:04:08 EST
On Wed, Jan 8, 2025 at 8:04 AM kernel test robot <oliver.sang@xxxxxxxxx> wrote:
>
>
>
> Hello,
>
>
> we found the new added test kmalloc_track_caller_oob_right randomly failed
> (10 out of 30 runs) which seems due to below (1)
>
> 1857099c18e16a72 3738290bfc99606787f515a4590
> ---------------- ---------------------------
>        fail:runs  %reproduction    fail:runs
>            |             |             |
>            :30          33%          10:30     kunit.kasan.fail
>            :30          33%          10:30     dmesg.BUG:KFENCE:memory_corruption_in_kmalloc_track_caller_oob_right <-- (1)
>
> below are details.
>
>
> kernel test robot noticed "kunit.kasan.fail" on:
>
> commit: 3738290bfc99606787f515a4590ad38dc4f79ca4 ("kasan: add kunit tests for kmalloc_track_caller, kmalloc_node_track_caller")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> [test failed on linus/master 0bc21e701a6ffacfdde7f04f87d664d82e8a13bf]
> [test failed on linux-next/master 8155b4ef3466f0e289e8fcc9e6e62f3f4dceeac2]
>
> in testcase: kunit
> version:
> with following parameters:
>
> group: group-03
>
>
>
> config: x86_64-rhel-9.4-kunit
> compiler: gcc-12
> test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz (Kaby Lake) with 32G memory
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
>
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> | Closes: https://lore.kernel.org/oe-lkp/202501081209.b7d8b735-lkp@xxxxxxxxx
>
>
>
> [ 117.724741] ok 3 kmalloc_node_oob_right
> [ 117.724849] ==================================================================
> [ 117.737591] BUG: KASAN: slab-out-of-bounds in kmalloc_track_caller_oob_right+0x4ca/0x530 [kasan_test]
> [ 117.747467] Write of size 1 at addr ffff888165906078 by task kunit_try_catch/3613
>
> [ 117.757782] CPU: 7 UID: 0 PID: 3613 Comm: kunit_try_catch Tainted: G B W N 6.12.0-rc6-00221-g3738290bfc99 #1
> [ 117.769291] Tainted: [B]=BAD_PAGE, [W]=WARN, [N]=TEST
> [ 117.775007] Hardware name: Dell Inc. OptiPlex 7050/062KRH, BIOS 1.2.0 12/22/2016
> [ 117.783056] Call Trace:
> [ 117.786185] <TASK>
> [ 117.788966] dump_stack_lvl+0x4f/0x70
> [ 117.793307] print_address_description.constprop.0+0x2c/0x3a0
> [ 117.799721] ? kmalloc_track_caller_oob_right+0x4ca/0x530 [kasan_test]
> [ 117.806918] print_report+0xb9/0x280
> [ 117.811183] ? kasan_addr_to_slab+0x9/0x90
> [ 117.815961] ? kmalloc_track_caller_oob_right+0x4ca/0x530 [kasan_test]
> [ 117.823154] kasan_report+0xcb/0x100
> [ 117.827408] ? kmalloc_track_caller_oob_right+0x4ca/0x530 [kasan_test]
> [ 117.834602] kmalloc_track_caller_oob_right+0x4ca/0x530 [kasan_test]
> [ 117.841626] ? __pfx_kmalloc_track_caller_oob_right+0x10/0x10 [kasan_test]
> [ 117.849166] ? __schedule+0x716/0x15e0
> [ 117.853589] ? ktime_get_ts64+0x7f/0x240
> [ 117.858186] kunit_try_run_case+0x173/0x440
> [ 117.863043] ? try_to_wake_up+0x913/0x1580
> [ 117.867813] ? __pfx_kunit_try_run_case+0x10/0x10
> [ 117.873187] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
> [ 117.878915] ? set_cpus_allowed_ptr+0x81/0xb0
> [ 117.883956] ? __pfx_set_cpus_allowed_ptr+0x10/0x10
> [ 117.889502] ? __pfx_kunit_try_run_case+0x10/0x10
> [ 117.894876] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
> [ 117.901633] kunit_generic_run_threadfn_adapter+0x79/0xe0
> [ 117.907698] kthread+0x2d4/0x3c0
> [ 117.911604] ? __pfx_kthread+0x10/0x10
> [ 117.916032] ret_from_fork+0x2d/0x70
> [ 117.920291] ? __pfx_kthread+0x10/0x10
> [ 117.924718] ret_from_fork_asm+0x1a/0x30
> [ 117.929324] </TASK>
>
> [ 117.934373] Allocated by task 3613:
> [ 117.938544] kasan_save_stack+0x1c/0x40
> [ 117.943062] kasan_save_track+0x10/0x30
> [ 117.947574] __kasan_kmalloc+0xa6/0xb0
> [ 117.951998] __kmalloc_node_track_caller_noprof+0x1bd/0x470
> [ 117.958239] kmalloc_track_caller_oob_right+0x8c/0x530 [kasan_test]
> [ 117.965176] kunit_try_run_case+0x173/0x440
> [ 117.970031] kunit_generic_run_threadfn_adapter+0x79/0xe0
> [ 117.976097] kthread+0x2d4/0x3c0
> [ 117.980000] ret_from_fork+0x2d/0x70
> [ 117.984251] ret_from_fork_asm+0x1a/0x30
>
> [ 117.991022] The buggy address belongs to the object at ffff888165906000
> which belongs to the cache kmalloc-128 of size 128
> [ 118.004873] The buggy address is located 0 bytes to the right of
> allocated 120-byte region [ffff888165906000, ffff888165906078)
>
> [ 118.021331] The buggy address belongs to the physical page:
> [ 118.027566] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x165906
> [ 118.036221] head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
> [ 118.044530] ksm flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff)
> [ 118.052494] page_type: f5(slab)
> [ 118.056314] raw: 0017ffffc0000040 ffff888100042a00 ffffea00202bd080 0000000000000003
> [ 118.064708] raw: 0000000000000000 0000000080200020 00000001f5000000 0000000000000000
> [ 118.073102] head: 0017ffffc0000040 ffff888100042a00 ffffea00202bd080 0000000000000003
> [ 118.081581] head: 0000000000000000 0000000080200020 00000001f5000000 0000000000000000
> [ 118.090061] head: 0017ffffc0000001 ffffea0005964181 ffffffffffffffff 0000000000000000
> [ 118.098541] head: 0000000000000002 0000000000000000 00000000ffffffff 0000000000000000
> [ 118.107021] page dumped because: kasan: bad access detected
>
> [ 118.115431] Memory state around the buggy address:
> [ 118.120904] ffff888165905f00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [ 118.128782] ffff888165905f80: fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc
> [ 118.136658] >ffff888165906000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc
> [ 118.144535]                                                                 ^
> [ 118.152323] ffff888165906080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [ 118.160211] ffff888165906100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc
> [ 118.168100] ==================================================================
> [ 118.176059] # kmalloc_track_caller_oob_right: EXPECTATION FAILED at mm/kasan/kasan_test_c.c:243
> KASAN failure expected in "ptr[size] = 'y'", but none occurred
> [ 118.176103] ==================================================================
> [ 118.201544] BUG: KFENCE: memory corruption in kmalloc_track_caller_oob_right+0x27b/0x530 [kasan_test]
>
> [ 118.213582] Corrupted memory at 0x00000000e59a4b3f [ ! . . . . . . . . . . . . . . . ] (in kfence-#20):
> [ 118.223645] kmalloc_track_caller_oob_right+0x27b/0x530 [kasan_test]
> [ 118.230667] kunit_try_run_case+0x173/0x440
> [ 118.235525] kunit_generic_run_threadfn_adapter+0x79/0xe0
> [ 118.241590] kthread+0x2d4/0x3c0
> [ 118.245497] ret_from_fork+0x2d/0x70
> [ 118.249748] ret_from_fork_asm+0x1a/0x30
>
> [ 118.256520] kfence-#20: 0x0000000036299d7e-0x000000000c1813d3, size=120, cache=kmalloc-128
>
> [ 118.267597] allocated by task 3613 on cpu 7 at 118.176015s (0.091581s ago):
> [ 118.275220] kmalloc_track_caller_oob_right+0x190/0x530 [kasan_test]
> [ 118.282241] kunit_try_run_case+0x173/0x440
> [ 118.287100] kunit_generic_run_threadfn_adapter+0x79/0xe0
> [ 118.293166] kthread+0x2d4/0x3c0
> [ 118.297071] ret_from_fork+0x2d/0x70
> [ 118.301322] ret_from_fork_asm+0x1a/0x30
>
> [ 118.308107] freed by task 3613 on cpu 7 at 118.176094s (0.132012s ago):
> [ 118.315381] kmalloc_track_caller_oob_right+0x27b/0x530 [kasan_test]
> [ 118.322403] kunit_try_run_case+0x173/0x440
> [ 118.327260] kunit_generic_run_threadfn_adapter+0x79/0xe0
> [ 118.333327] kthread+0x2d4/0x3c0
> [ 118.337233] ret_from_fork+0x2d/0x70
> [ 118.341482] ret_from_fork_asm+0x1a/0x30
>
> [ 118.348258] CPU: 7 UID: 0 PID: 3613 Comm: kunit_try_catch Tainted: G B W N 6.12.0-rc6-00221-g3738290bfc99 #1
> [ 118.359770] Tainted: [B]=BAD_PAGE, [W]=WARN, [N]=TEST
> [ 118.365490] Hardware name: Dell Inc. OptiPlex 7050/062KRH, BIOS 1.2.0 12/22/2016
> [ 118.373542] ==================================================================
> [ 118.381677] not ok 4 kmalloc_track_caller_oob_right
+Marco and Alexander
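Looks like KFENCE served one of the test's allocations from its own
pool, so the OOB write was reported by KFENCE (as memory corruption
on free) instead of by KASAN, and the KASAN expectation failed.
There's a KASAN issue filed for this problem [1], but no solution
has been implemented in the kernel so far.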
Perhaps it makes sense to disable KFENCE when running the KASAN test
suite on the kernel test robot for now?
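As a rough sketch of what that could look like (assuming the robot's
kunit job allows a config or kernel command-line override, which I
have not checked), either of the following should keep KFENCE from
picking up the test allocations:

```
# Build time: leave KFENCE out of the kernel used for the KASAN tests
# CONFIG_KFENCE is not set

# Or boot time, on a kernel built with CONFIG_KFENCE=y:
# append this to the kernel command line to disable KFENCE sampling
kfence.sample_interval=0
```

Only one of the two is needed. The boot parameter avoids a separate
build if the same kernel is also used for the KFENCE tests.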
Thank you!
[1] https://bugzilla.kernel.org/show_bug.cgi?id=212479