Re: [PATCH v6 05/37] fs: Convert alloc_inode_sb() to a macro
From: Suren Baghdasaryan
Date: Thu Apr 04 2024 - 12:58:25 EST
On Thu, Mar 21, 2024 at 3:47 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
>
> On Thu, Mar 21, 2024 at 3:17 PM Kent Overstreet
> <kent.overstreet@xxxxxxxxx> wrote:
> >
> > On Thu, Mar 21, 2024 at 03:09:08PM -0700, Andrew Morton wrote:
> > > On Thu, 21 Mar 2024 17:15:39 -0400 Kent Overstreet <kent.overstreet@xxxxxxxxx> wrote:
> > >
> > > > On Thu, Mar 21, 2024 at 01:31:47PM -0700, Andrew Morton wrote:
> > > > > On Thu, 21 Mar 2024 09:36:27 -0700 Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> > > > >
> > > > > > From: Kent Overstreet <kent.overstreet@xxxxxxxxx>
> > > > > >
> > > > > > We're introducing alloc tagging, which tracks memory allocations by
> > > > > > callsite. Converting alloc_inode_sb() to a macro means allocations will
> > > > > > be tracked by its caller, which is a bit more useful.
> > > > >
> > > > > I'd have thought that there would be many similar
> > > > > inlines-which-allocate-memory. Such as, I dunno, jbd2_alloc_inode().
> > > > > Do we have to go converting things to macros as people report
> > > > > misleading or less useful results, or is there some more general
> > > > > solution to this?
> > > >
> > > > No, this is just what we have to do.
> > >
> > > Well, this is something we strike in other contexts - kallsyms gives us
> > > an inlined function and it's rarely what we wanted.
> > >
> > > I think kallsyms has all the data which is needed to fix this - how
> > > hard can it be to figure out that a particular function address lies
> > > within an outer function? I haven't looked...
> >
> > This is different, though - even if a function is inlined in multiple
> > places there's only going to be one instance of a static var defined
> > within that function.
>
> I guess one simple way to detect the majority of these helpers would
> be to filter all entries from /proc/allocinfo which originate from
> header files.
>
> ~# grep ".*\.h:." /proc/allocinfo
> 933888 228 include/linux/mm.h:2863 func:pagetable_alloc
> 848 53 include/linux/mm_types.h:1175 func:mm_alloc_cid
> 0 0 include/linux/bpfptr.h:70 func:kvmemdup_bpfptr
> 0 0 include/linux/bpf.h:2237 func:bpf_map_kmalloc_node
> 0 0 include/linux/bpf.h:2256 func:bpf_map_alloc_percpu
> 0 0 include/linux/bpf.h:2256 func:bpf_map_alloc_percpu
> 0 0 include/linux/bpf.h:2237 func:bpf_map_kmalloc_node
> 0 0 include/linux/bpf.h:2249 func:bpf_map_kvcalloc
> 0 0 include/linux/bpf.h:2243 func:bpf_map_kzalloc
> 0 0 include/linux/bpf.h:2237 func:bpf_map_kmalloc_node
> 0 0 include/linux/ptr_ring.h:471 func:__ptr_ring_init_queue_alloc
> 0 0 include/linux/bpf.h:2256 func:bpf_map_alloc_percpu
> 0 0 include/linux/bpf.h:2237 func:bpf_map_kmalloc_node
> 0 0 include/net/tcx.h:80 func:tcx_entry_create
> 0 0 arch/x86/include/asm/pgalloc.h:156 func:p4d_alloc_one
> 487424 119 include/linux/mm.h:2863 func:pagetable_alloc
> 0 0 include/linux/mm.h:2863 func:pagetable_alloc
> 832 13 include/linux/jbd2.h:1607 func:jbd2_alloc_inode
> 0 0 include/linux/jbd2.h:1591 func:jbd2_alloc_handle
> 0 0 fs/nfs/iostat.h:51 func:nfs_alloc_iostats
> 0 0 include/net/netlabel.h:281 func:netlbl_secattr_cache_alloc
> 0 0 include/net/netlabel.h:381 func:netlbl_secattr_alloc
> 0 0 include/crypto/internal/acompress.h:76 func:__acomp_request_alloc
> 8064 84 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 1016 74 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 384 4 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 704 3 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 32 1 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 64 1 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 40 2 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
> 32 1 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
> 30000 625 include/acpi/platform/aclinuxex.h:67 func:acpi_os_acquire_object
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
> 0 0 include/acpi/platform/aclinuxex.h:67 func:acpi_os_acquire_object
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 512 1 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
> 192 6 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
> 0 0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 192 3 include/acpi/platform/aclinuxex.h:52 func:acpi_os_allocate
> 61992 861 include/acpi/platform/aclinuxex.h:67 func:acpi_os_acquire_object
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 include/acpi/platform/aclinuxex.h:67 func:acpi_os_acquire_object
> 0 0 include/acpi/platform/aclinuxex.h:57 func:acpi_os_allocate_zeroed
> 0 0 drivers/iommu/amd/amd_iommu.h:141 func:alloc_pgtable_page
> 0 0 drivers/iommu/amd/amd_iommu.h:141 func:alloc_pgtable_page
> 0 0 drivers/iommu/amd/amd_iommu.h:141 func:alloc_pgtable_page
> 0 0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc
> 0 0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc
> 0 0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc
> 0 0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc
> 0 0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc
> 0 0 include/linux/hid_bpf.h:154 func:call_hid_bpf_rdesc_fixup
> 0 0 include/linux/skbuff.h:3392 func:__dev_alloc_pages
> 114688 56 include/linux/ptr_ring.h:471 func:__ptr_ring_init_queue_alloc
> 0 0 include/linux/skmsg.h:415 func:sk_psock_init_link
> 0 0 include/linux/bpf.h:2237 func:bpf_map_kmalloc_node
> 0 0 include/linux/ptr_ring.h:628 func:ptr_ring_resize_multiple
> 24576 3 include/linux/ptr_ring.h:471 func:__ptr_ring_init_queue_alloc
> 0 0 include/net/netlink.h:1896 func:nla_memdup
> 0 0 include/linux/sockptr.h:97 func:memdup_sockptr
> 0 0 include/net/request_sock.h:131 func:reqsk_alloc
> 0 0 include/net/tcp.h:2456 func:tcp_v4_save_options
> 0 0 include/net/tcp.h:2456 func:tcp_v4_save_options
> 0 0 include/crypto/hash.h:586 func:ahash_request_alloc
> 0 0 include/linux/sockptr.h:97 func:memdup_sockptr
> 0 0 include/linux/sockptr.h:97 func:memdup_sockptr
> 0 0 net/sunrpc/auth_gss/auth_gss_internal.h:38 func:simple_get_netobj
> 0 0 include/crypto/hash.h:586 func:ahash_request_alloc
> 0 0 include/net/netlink.h:1896 func:nla_memdup
> 0 0 include/crypto/skcipher.h:869 func:skcipher_request_alloc
> 0 0 include/net/fq_impl.h:361 func:fq_init
> 0 0 include/net/netlabel.h:316 func:netlbl_catmap_alloc
>
> and it finds our example:
>
> 832 13 include/linux/jbd2.h:1607 func:jbd2_alloc_inode
>
> Interestingly, inlined functions that are called from multiple
> places have multiple entries with the same file+line:
>
> 0 0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc
> 0 0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc
> 0 0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc
> 0 0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc
> 0 0 include/linux/dma-fence-chain.h:91 func:dma_fence_chain_alloc
>
> So, duplicate entries can also be used as an indication of an inlined allocator.
> I'll go chase these down and will post a separate patch converting them.
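
The two heuristics quoted above (a call site that lives in a header, and
a call site that shows up more than once) can be sketched in userspace
roughly as follows. This assumes every /proc/allocinfo line has the form
"<bytes> <calls> <file>:<line> func:<name>" as in the listing;
parse_site(), is_header_site() and count_site() are hypothetical helper
names made up for this illustration, not anything in the kernel tree.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SITE_MAX 256

/*
 * Copy the third whitespace-separated field ("file:line") of one
 * allocinfo-style line into site. Returns 1 on success, 0 otherwise.
 * The field layout is an assumption taken from the listing above.
 */
static int parse_site(const char *line, char site[SITE_MAX])
{
	unsigned long long bytes, calls;

	return sscanf(line, "%llu %llu %255s", &bytes, &calls, site) == 3;
}

/* A site is in a header if the file name ends in ".h" before ":<line>". */
static int is_header_site(const char *site)
{
	const char *colon = strrchr(site, ':');

	return colon && colon - site >= 2 &&
	       colon[-2] == '.' && colon[-1] == 'h';
}

/* Count the lines in a newline-separated text whose site equals wanted. */
static int count_site(const char *text, const char *wanted)
{
	int n = 0;

	assert(text != NULL && wanted != NULL);
	while (*text) {
		size_t len = strcspn(text, "\n");
		char line[512], site[SITE_MAX];

		/* Work on a copy so sscanf cannot read past the line end. */
		if (len < sizeof(line)) {
			memcpy(line, text, len);
			line[len] = '\0';
			if (parse_site(line, site) && strcmp(site, wanted) == 0)
				n++;
		}
		text += len;
		if (*text == '\n')
			text++;
	}
	return n;
}
```

A site for which is_header_site() is true and count_site() returns more
than 1 would be a likely inlined allocator under these assumptions.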
I just posted
https://lore.kernel.org/all/20240404165404.3805498-1-surenb@xxxxxxxxxx/
which attributes allocations made by inlined functions in headers to
their callers.
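
For anyone following along, here is a minimal userspace sketch of why
turning a helper into a macro changes the attribution. It is not the
kernel's alloc_tag code: struct site_tag, tagged_malloc() and the
helper_alloc_*() names are hypothetical, and the macro relies on GNU
statement expressions (gcc/clang). The point is only that a static
defined inside an inline helper is shared by every caller in the file,
while a static defined by a macro is instantiated once per expansion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-callsite record; not the kernel's struct alloc_tag. */
struct site_tag {
	const char *file;
	int line;
	size_t bytes;
	size_t calls;
};

/* Tag used by the most recent allocation, kept so callers can inspect it. */
static struct site_tag *last_tag;

/* Allocate and account the size to the given tag on success. */
static void *tagged_malloc(struct site_tag *tag, size_t size)
{
	void *p;

	assert(tag != NULL);
	p = malloc(size);
	if (p) {
		tag->bytes += size;
		tag->calls++;
	}
	last_tag = tag;
	return p;
}

/* Inline helper: the tag lives in the helper, so all callers share it. */
static inline void *helper_alloc_inline(size_t size)
{
	static struct site_tag tag = { __FILE__, __LINE__, 0, 0 };

	return tagged_malloc(&tag, size);
}

/* Macro helper: each expansion defines its own tag at the caller's line. */
#define helper_alloc_macro(size) ({					\
	static struct site_tag _tag = { __FILE__, __LINE__, 0, 0 };	\
	tagged_malloc(&_tag, (size));					\
})

static struct site_tag *caller_a_inline(void)
{
	free(helper_alloc_inline(16));
	return last_tag;
}

static struct site_tag *caller_b_inline(void)
{
	free(helper_alloc_inline(32));
	return last_tag;
}

static struct site_tag *caller_a_macro(void)
{
	free(helper_alloc_macro(16));
	return last_tag;
}

static struct site_tag *caller_b_macro(void)
{
	free(helper_alloc_macro(32));
	return last_tag;
}
```

Here caller_a_inline() and caller_b_inline() are accounted to the same
tag (the helper's line), while caller_a_macro() and caller_b_macro()
each get a tag carrying their own line. In the kernel the header helpers
are static inline, so presumably each compilation unit that includes
them gets its own copy of the static, which would explain the repeated
file+line entries in the listing.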