Re: ahci KASAN warning in experimental arm64 allmodconfig boot

From: John Garry
Date: Tue Oct 15 2019 - 10:18:02 EST


On 14/10/2019 19:02, John Garry wrote:
Hi guys,

I'm experimenting by trying to boot an allmodconfig arm64 kernel, as
mentioned here:
https://lore.kernel.org/linux-arm-kernel/507325a3-030e-2843-0f46-7e18c60257de@xxxxxxxxxx/


I am getting a KASAN slab-out-of-bounds warning around the ahci code, as
below. I am also getting a crash some time after, but the stackframe is
mixed up in the dmesg, so hard to extract that.

The kernel is v5.4-rc3.

I'll continue to look at this as time allows.

I have cc'ed the scsi list as this may be related to the SAS host (using
libsas), but I'm doubtful, considering the log mentions ahci_init_one().


It may very well be related - just CONFIG_DEBUG_TEST_DRIVER_REMOVE triggers this WARN:

[ 23.452574] ------------[ cut here ]------------
[ 23.457190] WARNING: CPU: 59 PID: 1 at drivers/ata/libata-core.c:6676 ata_host_detach+0x15c/0x168
[ 23.466047] Modules linked in:
[ 23.469092] CPU: 59 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc1-00010-g5b83fd27752b-dirty #296
[ 23.477776] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019
[ 23.486286] pstate: a0c00009 (NzCv daif +PAN +UAO)
[ 23.491065] pc : ata_host_detach+0x15c/0x168
[ 23.495322] lr : ata_host_detach+0x88/0x168
[ 23.499491] sp : ffff800011cabb50
[ 23.502792] x29: ffff800011cabb50 x28: 0000000000000007
[ 23.508091] x27: ffff80001137f068 x26: ffff8000112c0c28
[ 23.513390] x25: 0000000000003848 x24: ffff0023ea185300
[ 23.518689] x23: 0000000000000001 x22: 00000000000014c0
[ 23.523987] x21: 0000000000013740 x20: ffff0023bdc20000
[ 23.529286] x19: 0000000000000000 x18: 0000000000000004
[ 23.534584] x17: 0000000000000001 x16: 00000000000000f0
[ 23.539883] x15: ffff0023eac13790 x14: ffff0023eb76c408
[ 23.545181] x13: 0000000000000000 x12: ffff0023eac13790
[ 23.550480] x11: ffff0023eb76c228 x10: 0000000000000000
[ 23.555779] x9 : ffff0023eac13798 x8 : 0000000040000000
[ 23.561077] x7 : 0000000000000002 x6 : 0000000000000001
[ 23.566376] x5 : 0000000000000002 x4 : 0000000000000000
[ 23.571674] x3 : ffff0023bf08a0bc x2 : 0000000000000000
[ 23.576972] x1 : 3099674201f72700 x0 : 0000000000400284
[ 23.582272] Call trace:
[ 23.584706] ata_host_detach+0x15c/0x168
[ 23.588616] ata_pci_remove_one+0x10/0x18
[ 23.592615] ahci_remove_one+0x20/0x40
[ 23.596356] pci_device_remove+0x3c/0xe0
[ 23.600267] really_probe+0xdc/0x3e0
[ 23.603830] driver_probe_device+0x58/0x100
[ 23.608000] device_driver_attach+0x6c/0x90
[ 23.612169] __driver_attach+0x84/0xc8
[ 23.615908] bus_for_each_dev+0x74/0xc8
[ 23.619730] driver_attach+0x20/0x28
[ 23.623292] bus_add_driver+0x148/0x1f0
[ 23.627115] driver_register+0x60/0x110
[ 23.630938] __pci_register_driver+0x40/0x48
[ 23.635199] ahci_pci_driver_init+0x20/0x28
[ 23.639372] do_one_initcall+0x5c/0x1b0
[ 23.643199] kernel_init_freeable+0x1a4/0x24c
[ 23.647546] kernel_init+0x10/0x108
[ 23.651023] ret_from_fork+0x10/0x18
[ 23.654590] ---[ end trace 634a14b675b71c13 ]---

I'll dig deeper if I get a chance.

John

Thanks,
John

Log snippet:

[ 137.529384][ T187]
==================================================================
[ 137.529403][ T187] BUG: KASAN: slab-out-of-bounds in
ata_link_next+0x5c/0x1b0
[ 137.529411][ T187] Read of size 4 at addr ffff002320073cc8 by task
kworker/u194:0/187
[ 137.529414][ T187]
[ 137.529426][ T187] CPU: 38 PID: 187 Comm: kworker/u194:0 Tainted: G
W 5.4.0-rc3+ #1149
[ 137.529433][ T187] Hardware name: Huawei D06 /D06, BIOS Hisilicon
D06 UEFI RC0 - V1.16.01 03/15/2019
[ 137.529447][ T187] Workqueue: events_unbound async_run_entry_fn
[ 137.529454][ T187] Call trace:
[ 137.529462][ T187] dump_backtrace+0x0/0x298
[ 137.529469][ T187] show_stack+0x20/0x30
[ 137.529477][ T187] dump_stack+0x190/0x21c
[ 137.529488][ T187] print_address_description.isra.6+0x80/0x3d0
[ 137.529496][ T187] __kasan_report+0x174/0x23c
[ 137.529503][ T187] kasan_report+0xc/0x18
[ 137.529511][ T187] __asan_load4+0xa4/0xb0
[ 137.529517][ T187] ata_link_next+0x5c/0x1b0
[ 137.529526][ T187] ata_scsi_scan_host+0x50/0x2d0
[ 137.529533][ T187] async_port_probe+0x7c/0xa8
[ 137.529541][ T187] async_run_entry_fn+0x118/0x340
[ 137.529549][ T187] process_one_work+0x7b8/0xdb8
[ 137.529556][ T187] worker_thread+0x414/0x6b8
[ 137.529562][ T187] kthread+0x1d4/0x1f0
[ 137.529570][ T187] ret_from_fork+0x10/0x18
[ 137.529574][ T187]
[ 137.529580][ T187] Allocated by task 16:
[ 137.529589][ T187] save_stack+0x28/0xb0
[ 137.529597][ T187] __kasan_kmalloc.isra.9+0xa0/0xc8
[ 137.529604][ T187] kasan_slab_alloc+0x14/0x20
[ 137.529612][ T187] kmem_cache_alloc+0x25c/0x420
[ 137.529620][ T187] __proc_create+0x430/0x458
[ 137.529627][ T187] proc_mkdir_data+0x50/0xf0
[ 137.529634][ T187] proc_mkdir+0x2c/0x38
[ 137.529642][ T187] register_irq_proc+0x104/0x278
[ 137.529650][ T187] __setup_irq+0xafc/0xe80
[ 137.529657][ T187] request_threaded_irq+0x198/0x218
[ 137.529666][ T187] devm_request_threaded_irq+0xb0/0x120
[ 137.529674][ T187] ahci_host_activate+0x1cc/0x290
[ 137.529682][ T187] ahci_init_one+0x1358/0x1570
[ 137.529691][ T187] local_pci_probe+0x78/0xf0
[ 137.529698][ T187] work_for_cpu_fn+0x30/0x50
[ 137.529705][ T187] process_one_work+0x7b8/0xdb8
[ 137.529713][ T187] worker_thread+0x4a8/0x6b8
[ 137.529719][ T187] kthread+0x1d4/0x1f0
[ 137.529726][ T187] ret_from_fork+0x10/0x18
[ 137.529729][ T187]
[ 137.529734][ T187] Freed by task 0:
[ 137.529737][ T187] (stack is not available)
[ 137.529741][ T187]
[ 137.529747][ T187] The buggy address belongs to the object at
ffff002320073b68
[ 137.529747][ T187] which belongs to the cache proc_dir_entry of
size 256
[ 137.529755][ T187] The buggy address is located 96 bytes to the
right of
[ 137.529755][ T187] 256-byte region [ffff002320073b68,
ffff002320073c68)
[ 137.529759][ T187] The buggy address belongs to the page:
[ 137.529768][ T187] page:fffffe008c601c00 refcount:1 mapcount:0
mapping:ffff0020bf1b8480 index:0xffff002320070988 compound_mapcount: 0
[ 137.529779][ T187] flags: 0x1ffff00000010200(slab|head)
[ 137.529791][ T187] raw: 1ffff00000010200 ffff0020bf1a3cd0
fffffe008c612208 ffff0020bf1b8480
[ 137.529801][ T187] raw: ffff002320070988 000000000035000b
00000001ffffffff 0000000000000000
[ 137.529805][ T187] page dumped because: kasan: bad access detected
[ 137.529809][ T187]
[ 137.529812][ T187] Memory state around the buggy address:
[ 137.529820][ T187] ffff002320073b80: 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00
[ 137.529827][ T187] ffff002320073c00: 00 00 00 00 00 00 00 00 00 00
00 00 00 fc fc fc
[ 137.529835][ T187] >ffff002320073c80: fc fc fc fc fc fc fc fc fc fc
fc fc fc fc fc fc
[ 137.529839][ T187] ^
[ 137.529846][ T187] ffff002320073d00: fc fc fc fc fc fc fc fc fc fc
fc fc fc fc fc fc
[ 137.529853][ T187] ffff002320073d80: fc fc fc fc fc fc fc fc fc fc
fc fc fc fc fc fc
[ 137.529857][ T187]
==================================================================
[ 137.529874][ T187] Unable to handle kernel paging request at virtual
address 000004a02a0040ba
[ 137.529883][ T187] Mem abort info:
[ 137.529893][ T187] ESR = 0x96000004
[ 137.529905][ T187] EC = 0x25: DABT (current EL), IL = 32 bits
[ 137.529915][ T187] SET = 0, FnV = 0
[ 137.529925][ T187] EA = 0, S1PTW = 0
[ 137.529934][ T187] Data abort info:
[ 137.529940][ T16] kobject: '177' ((____ptrval____)):
kobject_add_internal: parent: 'irq', set: '<NULL>'
[ 137.529945][ T187] ISV = 0, ISS = 0x00000004
[ 137.529955][ T187] CM = 0, WnR = 0
[ 137.529966][ T187] [000004a02a0040ba] user address but active_mm is
swapper
[ 137.529979][ T187] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 137.529985][ T187] Modules linked in:
[ 137.529996][ T187] CPU: 38 PID: 187 Comm: kworker/u194:0 Tainted: G
B W 5.4.0-rc3+ #1149
[ 137.530002][ T187] Hardware name: Huawei D06 /D06, BIOS Hisilicon
D06 UEFI RC0 - V1.16.01 03/15/2019
[ 137.530011][ T187] Workqueue: events_unbound async_run_entry_fn
[ 137.530020][ T187] pstate: 40800009 (nZcv daif -PAN +UAO)
[ 137.530026][ T187] pc : ata_dev_next+0xf8/0x1f8
[ 137.530033][ T187] lr : ata_dev_next+0xf8/0x1f8
[ 137.530037][ T187] sp : ffff0020bddafb80
[ 137.530042][ T187] x29: ffff0020bddafb80 x28: 0000000000000000
[ 137.530050][ T187] x27: 0000000000003cc0 x26: ffffa00015016000
[ 137.530059][ T187] x25: 00000000000020c0 x24: ffffa000113102a8
[ 137.530068][ T187] x23: 350004a02a0003fa x22: ffffa00011335db0
[ 137.530077][ T187] x21: ffffa000113356f0 x20: 0000000000000000
[ 137.530085][ T187] x19: ffffa00011336830 x18: 00000000000025a8
[ 137.530094][ T187] x17: 00000000000025a0 x16: 00000000000026b0
[ 137.530102][ T187] x15: 0000000000001470 x14: 3d3d3d3d3d3d3d3d
[ 137.530111][ T187] x13: 3d3d3d3d3d3d3d3d x12: 1fffe00417bb5f3e
[ 137.530120][ T187] x11: ffff800417bb5f3e x10: dfffa00000000000
[ 137.530129][ T187] x9 : ffff800417bb5f3f x8 : 0000000000000001
[ 137.530138][ T187] x7 : 0000000000003cd0 x6 : 0000000000000000
[ 137.530146][ T187] x5 : ffffa00015015cc8 x4 : 0000000000000000
[ 137.530155][ T187] x3 : ffffa00011301608 x2 : 0000000000000000
[ 137.530163][ T187] x1 : 21f045c4027f9500 x0 : 0000000000000000
[ 137.530171][ T187] Call trace:
[ 137.530178][ T187] ata_dev_next+0xf8/0x1f8
[ 137.530187][ T187] ata_scsi_scan_host+0x6c/0x2d0
[ 137.530194][ T187] async_port_probe+0x7c/0xa8
[ 137.530202][ T187] async_run_entry_fn+0x118/0x340
[ 137.530209][ T187] process_one_work+0x7b8/0xdb8
[ 137.530216][ T187] worker_thread+0x414/0x6b8
[ 137.530223][ T187] kthread+0x1d4/0x1f0
[ 137.530230][ T187] ret_from_fork+0x10/0x18
[ 137.530242][ T187] Code: 97c544f6 d2879801 8b0102e0 97cd4fb5 (f95e62e0)
[ 137.530472][ T187] ---[ end trace 1702441cf7c2cd89 ]---
[ 137.530479][ T187] Kernel panic - not syncing: Fatal exception
[ 137.530592][ T16] kobject: '178' ((____ptrval____)):
kobject_add_internal: parent: 'irq', set: '<NULL>'
[ 137.530653][ T187] SMP: stopping secondary CPUs
[ 137.531078][ T187] Kernel Offset: disabled
[ 137.531085][ T187] CPU features: 0x0002,23208a38
[ 137.531089][ T187] Memory Limit: none
[ 143.699079][ T187] ---[ end Kernel panic - not syncing: Fatal
exception ]---