Re: [PATCH v2 2/2] mm, sl[aou]b: guarantee natural alignment for kmalloc(power-of-two)

From: Vlastimil Babka
Date: Thu Aug 29 2019 - 03:56:20 EST


On 8/29/19 12:24 AM, Dave Chinner wrote:
> On Wed, Aug 28, 2019 at 12:46:08PM -0700, Matthew Wilcox wrote:
>> On Wed, Aug 28, 2019 at 06:45:07PM +0000, Christopher Lameter wrote:
>>> I still think implicit exceptions to alignments are a bad idea. Those need
>>> to be explicity specified and that is possible using kmem_cache_create().
>>
>> I swear we covered this last time the topic came up, but XFS would need
>> to create special slab caches for each size between 512 and PAGE_SIZE.
>> Potentially larger, depending on whether the MM developers are willing to
>> guarantee that kmalloc(PAGE_SIZE * 2, GFP_KERNEL) will return a PAGE_SIZE
>> aligned block of memory indefinitely.
>
> Page size alignment of multi-page heap allocations is ncessary. The
> current behaviour w/ KASAN is to offset so a 8KB allocation spans 3
> pages and is not page aligned. That causes just as much in way
> of alignment problems as unaligned objects in multi-object-per-page
> slabs.

Ugh, multi-page (power of two) allocations *at the page allocator level*
simply have to be aligned, as that's how the buddy allocator has always
worked, and it would be madness to try relax that guarantee and require
an explicit flag at this point. The kmalloc wrapper with SLUB will pass
everything above 8KB directly to the page allocator, so that's fine too.
4k and 8k are the only (multi-)page sizes still managed as SLUB objects.
I would say that these sizes are the most striking example that it's
wrong not to align them without extra flags or special API variant.

> As I said in the lastest discussion of this problem on XFS (pmem
> devices w/ KASAN enabled), all we -need- is a GFP flag that tells the
> slab allocator to give us naturally aligned object or fail if it
> can't. I don't care how that gets implemented (e.g. another set of
> heap slabs like the -rcl slabs), I just don't want every high level

Given alignment is orthogonal to -rcl and dma-, would that be another
three sets? Or we assume that dma- would want it always, and complicate
the rules further? Funilly enough, SLOB would be the simplest case here.

> subsystem that allocates heap memory for IO buffers to have to
> implement their own aligned slab caches.

Definitely agree. I still hope we can do that even without a new flag/API.

> Cheers,
>
> Dave.
>