Re: 6.12/BUG: KASAN: slab-use-after-free in m_next at fs/proc/task_mmu.c:187

From: Lorenzo Stoakes
Date: Wed Oct 02 2024 - 13:56:29 EST


Thanks for your report!

On Wed, Oct 02, 2024 at 10:34:32PM GMT, Mikhail Gavrilov wrote:
> On Wed, Sep 25, 2024 at 3:28 AM Mikhail Gavrilov
> <mikhail.v.gavrilov@xxxxxxxxx> wrote:
> >
> > Hi,
> > I am testing kernel snapshots on Fedora Rawhide and Today with build
> > on commit de5cb0dcb74c I saw for the first time "KASAN:
> > slab-use-after-free in m_next+0x13b".
> > Unfortunately it is not clear what triggered this problem because it
> > happened after 21 hour uptime.
> >
> > Full trace looks like:
> > input: Noble FoKus Mystique (AVRCP) as /devices/virtual/input/input26
> > ==================================================================
> > BUG: KASAN: slab-use-after-free in m_next+0x13b/0x170
> > Read of size 8 at addr ffff8885609b40f0 by task htop/3847
> >
> > CPU: 14 UID: 1000 PID: 3847 Comm: htop Tainted: G W L
> > ------- --- 6.12.0-0.rc0.20240923gitde5cb0dcb74c.9.fc42.x86_64+debug
> > #1
> > Tainted: [W]=WARN, [L]=SOFTLOCKUP
> > Hardware name: ASUS System Product Name/ROG STRIX B650E-I GAMING WIFI,
> > BIOS 3040 09/12/2024
> > Call Trace:
> > <TASK>
> > dump_stack_lvl+0x84/0xd0
> > ? m_next+0x13b/0x170
> > print_report+0x174/0x505
> > ? m_next+0x13b/0x170
> > ? __virt_addr_valid+0x231/0x420
> > ? m_next+0x13b/0x170
> > kasan_report+0xab/0x180
> > ? m_next+0x13b/0x170
> > m_next+0x13b/0x170
> > seq_read_iter+0x8e5/0x1130
> > seq_read+0x2b4/0x3c0
> > ? __pfx_seq_read+0x10/0x10
> > ? inode_security+0x54/0xf0
> > ? rw_verify_area+0x3b2/0x5e0
> > vfs_read+0x165/0xa20
> > ? __pfx_vfs_read+0x10/0x10
> > ? ktime_get_coarse_real_ts64+0x41/0xd0
> > ? local_clock_noinstr+0xd/0x100
> > ? __pfx_lock_release+0x10/0x10
> > ksys_read+0xfb/0x1d0
> > ? __pfx_ksys_read+0x10/0x10
> > ? ktime_get_coarse_real_ts64+0x41/0xd0
> > do_syscall_64+0x97/0x190
> > ? __lock_acquire+0xdcd/0x62c0
> > ? __pfx___lock_acquire+0x10/0x10
> > ? __pfx___lock_acquire+0x10/0x10
> > ? __pfx___lock_acquire+0x10/0x10
> > ? audit_filter_inodes.part.0+0x12d/0x220
> > ? local_clock_noinstr+0xd/0x100
> > ? __pfx_lock_release+0x10/0x10
> > ? rcu_is_watching+0x12/0xc0
> > ? kfree+0x27c/0x4d0
> > ? audit_reset_context+0x8c5/0xee0
> > ? lockdep_hardirqs_on_prepare+0x171/0x400
> > ? do_syscall_64+0xa3/0x190
> > ? lockdep_hardirqs_on+0x7c/0x100
> > ? do_syscall_64+0xa3/0x190
> > ? do_syscall_64+0xa3/0x190
> > entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > RIP: 0033:0x7f4190dcac36
> > Code: 89 df e8 2d c1 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 75 15
> > 83 e2 39 83 fa 08 75 0d e8 32 ff ff ff 66 90 48 8b 45 10 0f 05 <48> 8b
> > 5d f8 c9 c3 0f 1f 40 00 f3 0f 1e fa 55 48 89 e5 48 83 ec 08
> > RSP: 002b:00007ffcde82b690 EFLAGS: 00000202 ORIG_RAX: 0000000000000000
> > RAX: ffffffffffffffda RBX: 00007f4190ce3740 RCX: 00007f4190dcac36
> > RDX: 0000000000000400 RSI: 000055bf5e823a20 RDI: 0000000000000005
> > RBP: 00007ffcde82b6a0 R08: 0000000000000000 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000202 R12: 00007f4190f44fd0
> > R13: 00007f4190f44e80 R14: 000055bf5e823e20 R15: 000055bf5ecc9160
> > </TASK>
> >
> > Allocated by task 176289:
> > kasan_save_stack+0x30/0x50
> > kasan_save_track+0x14/0x30
> > __kasan_slab_alloc+0x6e/0x70
> > kmem_cache_alloc_noprof+0x15a/0x3d0
> > vm_area_dup+0x23/0x190
> > __split_vma+0x137/0xd40
> > vms_gather_munmap_vmas+0x29d/0xfc0
> > mmap_region+0x35a/0x1f50
> > do_mmap+0x8e7/0x1020
> > vm_mmap_pgoff+0x178/0x2f0
> > __do_fast_syscall_32+0x86/0x110
> > do_fast_syscall_32+0x32/0x80
> > sysret32_from_system_call+0x0/0x4a
> >
> > Freed by task 0:
> > kasan_save_stack+0x30/0x50
> > kasan_save_track+0x14/0x30
> > kasan_save_free_info+0x3b/0x70
> > __kasan_slab_free+0x37/0x50
> > kmem_cache_free+0x1a7/0x5a0
> > rcu_do_batch+0x3fd/0x1120
> > rcu_core+0x636/0x9b0
> > handle_softirqs+0x1e9/0x8d0
> > __irq_exit_rcu+0xbb/0x1c0
> > irq_exit_rcu+0xe/0x30
> > sysvec_apic_timer_interrupt+0xa1/0xd0
> > asm_sysvec_apic_timer_interrupt+0x1a/0x20
> >
> > Last potentially related work creation:
> > kasan_save_stack+0x30/0x50
> > __kasan_record_aux_stack+0x8e/0xa0
> > __call_rcu_common.constprop.0+0xf4/0x10d0
> > vma_complete+0x720/0x10b0
> > commit_merge+0x42a/0x1310
> > vma_expand+0x313/0xad0
> > vma_merge_new_range+0x2cd/0xec0
> > mmap_region+0x432/0x1f50
> > do_mmap+0x8e7/0x1020
> > vm_mmap_pgoff+0x178/0x2f0
> > __do_fast_syscall_32+0x86/0x110
> > do_fast_syscall_32+0x32/0x80
> > sysret32_from_system_call+0x0/0x4a
> >
> > The buggy address belongs to the object at ffff8885609b40f0
> > which belongs to the cache vm_area_struct of size 176
> > The buggy address is located 0 bytes inside of
> > freed 176-byte region [ffff8885609b40f0, ffff8885609b41a0)
> >
> > The buggy address belongs to the physical page:
> > page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x5609b4
> > head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
> > memcg:ffff88814d36d001
> > flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff)
> > page_type: f5(slab)
> > raw: 0017ffffc0000040 ffff888108113d40 dead000000000100 dead000000000122
> > raw: 0000000000000000 0000000000220022 00000001f5000000 ffff88814d36d001
> > head: 0017ffffc0000040 ffff888108113d40 dead000000000100 dead000000000122
> > head: 0000000000000000 0000000000220022 00000001f5000000 ffff88814d36d001
> > head: 0017ffffc0000001 ffffea0015826d01 ffffffffffffffff 0000000000000000
> > head: 0000000000000002 0000000000000000 00000000ffffffff 0000000000000000
> > page dumped because: kasan: bad access detected
> >
> > Memory state around the buggy address:
> > ffff8885609b3f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > ffff8885609b4000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > >ffff8885609b4080: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fa fb
> > ^
> > ffff8885609b4100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> > ffff8885609b4180: fb fb fb fb fc fc fc fc fc fc fc fc 00 00 00 00
> > ==================================================================
> > Disabling lock debugging due to kernel taint
> >
> > > sh /usr/src/kernels/(uname -r)/scripts/faddr2line /lib/debug/lib/modules/(uname -r)/vmlinux m_next+0x13b
> > m_next+0x13b/0x170:
> > proc_get_vma at fs/proc/task_mmu.c:136
> > (inlined by) m_next at fs/proc/task_mmu.c:187
> >
> > > cat -n /usr/src/debug/kernel-6.11-8833-gde5cb0dcb74c/linux-6.12.0-0.rc0.20240923gitde5cb0dcb74c.9.fc42.x86_64/fs/proc/task_mmu.c | sed -n '182,192 p'
> > 182 {
> > 183 if (*ppos == -2UL) {
> > 184 *ppos = -1UL;
> > 185 return NULL;
> > 186 }
> > 187 return proc_get_vma(m->private, ppos);
> > 188 }
> > 189
> > 190 static void m_stop(struct seq_file *m, void *v)
> > 191 {
> > 192 struct proc_maps_private *priv = m->private;
> >
> > > git blame fs/proc/task_mmu.c -L 182,192
> > Blaming lines: 100% (11/11), done.
> > a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 182) {
> > c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 183)
> > if (*ppos == -2UL) {
> > c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 184)
> > *ppos = -1UL;
> > c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 185)
> > return NULL;
> > c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 186) }
> > c4c84f06285e4 (Matthew Wilcox (Oracle) 2022-09-06 19:48:57 +0000 187)
> > return proc_get_vma(m->private, ppos);
> > a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 188) }
> > a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 189)
> > a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 190)
> > static void m_stop(struct seq_file *m, void *v)
> > a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 191) {
> > a6198797cc3fd (Matt Mackall 2008-02-04 22:29:03 -0800 192)
> > struct proc_maps_private *priv = m->private;
> >
> > Hmm this line hasn't changed for two years.
> >
> > Machine spec: https://linux-hardware.org/?probe=323b76ce48
> > I attached below full kernel log and build config.
> >
> > Can anyone figure out what happened or should we wait for the second
> > manifestation of this issue?
> >
>
> Finally I spotted that this issue is caused by the Steam client.
> And usually happens after downloading game updates.
> Looks like Steam client runs some post update scripts which cause
> slab-use-after-free in m_next.

Yeah, a similar issue is being investigated elsewhere.

See
https://lore.kernel.org/all/c63a64a9-cdee-4586-85ba-800e8e1a8054@lucifer.local/
for latest update.

That investigation is ongoing, but it also involves Steam, points at this same
commit, and is also related to a Steam update doing something strange. So
strange that I literally can't repro it locally :) but Bert in that thread can.

We can reliably repro it with CONFIG_DEBUG_VM_MAPLE_TREE, CONFIG_DEBUG_VM, and
CONFIG_DEBUG_MAPLE_TREE set. If you set these, you should see a report more
quickly (let us know if you do).
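
If it helps, here is a minimal sketch for checking whether a kernel config
already has those three options enabled. It is only an illustration: the
helper names are made up for this mail, and the /boot/config-<release> path
is an assumption about where your distro installs the config.

```python
import os
import re

# The three options named above.
REQUIRED = (
    "CONFIG_DEBUG_VM",
    "CONFIG_DEBUG_VM_MAPLE_TREE",
    "CONFIG_DEBUG_MAPLE_TREE",
)


def missing_options(config_text, required=REQUIRED):
    """Return the required options that are not set to 'y' in config_text."""
    # Lines such as "# CONFIG_FOO is not set" do not match and so count as missing.
    enabled = set(re.findall(r"^(CONFIG_\w+)=y$", config_text, flags=re.MULTILINE))
    return [opt for opt in required if opt not in enabled]


def running_kernel_config_path():
    """Assumed location of the running kernel's config; adjust if yours differs."""
    return f"/boot/config-{os.uname().release}"


# Example use (not run on import):
#   with open(running_kernel_config_path()) as f:
#       print(missing_options(f.read()))
```

An empty list means all three are set; anything it returns still needs
enabling before rebuilding.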


Also note that there is a critical error handling fix in

https://lore.kernel.org/linux-mm/20241002073932.13482-1-lorenzo.stoakes@xxxxxxxxxx/

which should get hotfixed soon.



>
> Git bisect found the first bad commit:
> commit f8d112a4e657c65c888e6b8a8435ef61a66e4ab8 (HEAD)
> Author: Liam R. Howlett <Liam.Howlett@xxxxxxxxxx>
> Date: Fri Aug 30 00:00:54 2024 -0400
>
> mm/mmap: avoid zeroing vma tree in mmap_region()
>
> Instead of zeroing the vma tree and then overwriting the area, let the
> area be overwritten and then clean up the gathered vmas using
> vms_complete_munmap_vmas().
>
> To ensure locking is downgraded correctly, the mm is set regardless of
> MAP_FIXED or not (NULL vma).
>
> If a driver is mapping over an existing vma, then clear the ptes before
> the call_mmap() invocation. This is done using the vms_clean_up_area()
> helper. If there is a close vm_ops, that must also be called to ensure
> any cleanup is done before mapping over the area. This also means that
> calling open has been added to the abort of an unmap operation, for now.
>
> Since vm_ops->open() and vm_ops->close() are not always undo each other
> (state cleanup may exist in ->close() that is lost forever), the code
> cannot be left in this way, but that change has been isolated to another
> commit to make this point very obvious for traceability.
>
> Temporarily keep track of the number of pages that will be removed and
> reduce the charged amount.
>
> This also drops the validate_mm() call in the vma_expand() function. It
> is necessary to drop the validate as it would fail since the mm map_count
> would be incorrect during a vma expansion, prior to the cleanup from
> vms_complete_munmap_vmas().
>
> Clean up the error handing of the vms_gather_munmap_vmas() by calling the
> verification within the function.
>
> Link: https://lkml.kernel.org/r/20240830040101.822209-15-Liam.Howlett@xxxxxxxxxx
> Signed-off-by: Liam R. Howlett <Liam.Howlett@xxxxxxxxxx>
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>
> Cc: Bert Karwatzki <spasswolf@xxxxxx>
> Cc: Jeff Xu <jeffxu@xxxxxxxxxxxx>
> Cc: Jiri Olsa <olsajiri@xxxxxxxxx>
> Cc: Kees Cook <kees@xxxxxxxxxx>
> Cc: Lorenzo Stoakes <lstoakes@xxxxxxxxx>
> Cc: Mark Brown <broonie@xxxxxxxxxx>
> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> Cc: Paul Moore <paul@xxxxxxxxxxxxxx>
> Cc: Sidhartha Kumar <sidhartha.kumar@xxxxxxxxxx>
> Cc: Suren Baghdasaryan <surenb@xxxxxxxxxx>
> Cc: Vlastimil Babka <vbabka@xxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>
> mm/mmap.c | 57 +++++++++++++++++++++++++++------------------------------
> mm/vma.c | 54 ++++++++++++++++++++++++++++++++++++++++++------------
> mm/vma.h | 22 ++++++++++++++++------
> 3 files changed, 85 insertions(+), 48 deletions(-)
>
> --
> Best Regards,
> Mike Gavrilov.