Re: [PATCH v3 0/7] kmalloc-reclaimable caches

From: Roman Gushchin
Date: Thu Jul 19 2018 - 15:54:25 EST


On Wed, Jul 18, 2018 at 03:36:13PM +0200, Vlastimil Babka wrote:
> v3 changes:
> - fix missing hunk in patch 5/7
> - more verbose cover letter and patch 6/7 commit log
> v2 changes:
> - shorten cache names to kmalloc-rcl-<SIZE>
> - last patch shortens <SIZE> for all kmalloc caches to e.g. "1k", "4M"
> - include dma caches to the 2D kmalloc_caches[] array to avoid a branch
> - vmstat counter nr_indirectly_reclaimable_bytes renamed to
> nr_kernel_misc_reclaimable, doesn't include kmalloc-rcl-*
> - /proc/meminfo counter renamed to KReclaimable, includes kmalloc-rcl*
> and nr_kernel_misc_reclaimable
>
> Hi,
>
> as discussed at LSF/MM [1] here's a patchset that introduces
> kmalloc-reclaimable caches (more details in the second patch) and uses them for
> SLAB freelists and dcache external names. The latter allows us to repurpose the
> NR_INDIRECTLY_RECLAIMABLE_BYTES counter later in the series.
>
> With patch 4/7, dcache external names are allocated from kmalloc-rcl-*
> caches, eliminating the need for manual accounting. More importantly, it
> also ensures the reclaimable kmalloc allocations are grouped in pages
> separate from the regular kmalloc allocations. The need for proper
> accounting of dcache external names has shown it's easy for misbehaving
> process to allocate lots of them, causing premature OOMs. Without the
> added grouping, it's likely that a similar workload can interleave the
> dcache external names allocations with regular kmalloc allocations
> (note: I haven't searched myself for an example of such regular kmalloc
> allocation, but I would be very surprised if there wasn't some). A
> pathological case would be e.g. one 64byte regular allocations with 63
> external dcache names in a page (64x64=4096), which means the page is
> not freed even after reclaiming after all dcache names, and the process
> can thus "steal" the whole page with single 64byte allocation.
>
> If there other kmalloc users similar to dcache external names become
> identified, they can also benefit from the new functionality simply by
> adding __GFP_RECLAIMABLE to the kmalloc calls.
>
> Side benefits of the patchset (that could be also merged separately)
> include removed branch for detecting __GFP_DMA kmalloc(), and shortening
> kmalloc cache names in /proc/slabinfo output. The latter is potentially
> an ABI break in case there are tools parsing the names and expecting the
> values to be in bytes.
>
> This is how /proc/slabinfo looks like after booting in virtme:
>
> ...
> kmalloc-rcl-4M 0 0 4194304 1 1024 : tunables 1 1 0 : slabdata 0 0 0
> ...
> kmalloc-rcl-96 7 32 128 32 1 : tunables 120 60 8 : slabdata 1 1 0
> kmalloc-rcl-64 25 128 64 64 1 : tunables 120 60 8 : slabdata 2 2 0
> kmalloc-rcl-32 0 0 32 124 1 : tunables 120 60 8 : slabdata 0 0 0
> kmalloc-4M 0 0 4194304 1 1024 : tunables 1 1 0 : slabdata 0 0 0
> kmalloc-2M 0 0 2097152 1 512 : tunables 1 1 0 : slabdata 0 0 0
> kmalloc-1M 0 0 1048576 1 256 : tunables 1 1 0 : slabdata 0 0 0
> ...
>
> /proc/vmstat with renamed nr_indirectly_reclaimable_bytes counter:
>
> ...
> nr_slab_reclaimable 2817
> nr_slab_unreclaimable 1781
> ...
> nr_kernel_misc_reclaimable 0
> ...
>
> /proc/meminfo with new KReclaimable counter:
>
> ...
> Shmem: 564 kB
> KReclaimable: 11260 kB
> Slab: 18368 kB
> SReclaimable: 11260 kB
> SUnreclaim: 7108 kB
> KernelStack: 1248 kB
> ...
>
> Thanks,
> Vlastimil

Hi, Vlastimil!

Overall the patchset looks solid to me.
Please, feel free to add
Acked-by: Roman Gushchin <guro@xxxxxx>

Two small nits:
1) The last patch is unrelated to the main idea,
and can potentially cause ABI breakage.
I'd separate it from the rest of the patchset.

2) It's actually re-opening the security issue for SLOB
users. Is the memory overhead really big enough to
justify that?

Thanks!