Re: [PATCH v1] mm/vmalloc: fix exact allocations with an alignment > 1

From: David Hildenbrand
Date: Wed Sep 29 2021 - 11:05:15 EST


On 29.09.21 16:49, Uladzislau Rezki wrote:
On Wed, Sep 29, 2021 at 4:40 PM David Hildenbrand <david@xxxxxxxxxx> wrote:

On 29.09.21 16:30, Uladzislau Rezki wrote:

So the idea is that once we run into a dead end because we took a left
subtree, we roll back to the next possible right subtree and try again.
If we run into another dead end, we repeat ... thus, this can now happen
more than once.

I assume the only implication is that this can now be slower in some
corner cases with larger alignment, because it might take longer to find
something suitable. Fair enough.

Yep, your understanding is correct regarding the tree traversal. If no
suitable block is found in the left sub-tree, we roll back and check
the right one. So the scanning can happen more than once.
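For reference, the traversal with the rollback looks roughly like this
(a simplified sketch of find_vmap_lowest_match(), not the exact code;
get_subtree_max_size() is the augmented-rbtree metadata the search
relies on):

length = size;
node = free_vmap_area_root.rb_node;

while (node) {
    va = rb_entry(node, struct vmap_area, rb_node);

    /* prefer the left sub-tree: lowest addresses first */
    if (get_subtree_max_size(node->rb_left) >= length &&
            vstart < va->va_start) {
        node = node->rb_left;
    } else {
        if (is_within_this_va(va, size, align, vstart))
            return va;

        /* the right sub-tree, if it can hold a big-enough block */
        if (get_subtree_max_size(node->rb_right) >= length) {
            node = node->rb_right;
            continue;
        }

        /*
         * Dead end: roll back to the first parent whose right
         * sub-tree may still fit. With an alignment > 1 this
         * rollback can now happen more than once.
         */
        while ((node = rb_parent(node))) {
            va = rb_entry(node, struct vmap_area, rb_node);
            if (is_within_this_va(va, size, align, vstart))
                return va;

            if (get_subtree_max_size(node->rb_right) >= length &&
                    vstart <= va->va_start) {
                node = node->rb_right;
                break;
            }
        }
    }
}

return NULL;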

I did some performance analysis using the vmalloc test suite to figure
out the performance loss for allocations with a specific alignment. On
that synthetic test I see approx. 30% degradation:

How realistic is that test case? I assume most alignments we're dealing
with are:
* 1 / PAGE_SIZE
* huge page size (for automatic huge page placing)

Well, that is a synthetic test. Most of the alignments are 1 or
PAGE_SIZE. There are users of the internal API where you can specify
the alignment you want, but those are mainly KASAN, module alloc, etc.


2.225 microseconds vs 1.496 microseconds. That time includes both the
vmalloc() and vfree() calls. I do not consider it a big degradation,
but on the other hand we can still adjust the search length for
alignments > one page:

/* on top of the previous proposal: search for "length" instead of "size" */
length = align > PAGE_SIZE ? size + align : size;
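
With that, for example (hypothetical numbers, matching the KASAN case
discussed below):

size   = 16UL << 20;    /* 16 MB exact request */
align  =  2UL << 20;    /* PMD-sized alignment */
length = align > PAGE_SIZE ? size + align : size;
/* length == 18 MB: the search now looks for an 18 MB free block */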

That will not allow placing huge pages in the case of KASAN. And I
consider that more important than optimizing a synthetic test :) My 2
cents.

Could you please be more specific? I mean, how is it connected with
huge page mappings? Huge pages are the ones with order > 0. Or do you
mean that special alignments are needed for mapping huge pages?

Let me try to clarify:


KASAN does an exact allocation when onlining a memory block;
__vmalloc_node_range() will try placing huge pages first, increasing
the alignment to, e.g., "1 << PMD_SHIFT".

If we increase the search length in find_vmap_lowest_match(), that search will fail if the exact allocation is surrounded by other allocations. In that case, we won't place a huge page although we could -- because find_vmap_lowest_match() would be imprecise for alignments > PAGE_SIZE.
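
For reference, the huge-page placing in __vmalloc_node_range() looks
roughly like this (simplified, error handling omitted):

/* try placing huge pages first by bumping shift/align/size */
if (arch_vmap_pmd_supported(prot) && size >= PMD_SIZE)
    shift = PMD_SHIFT;

align = max(real_align, 1UL << shift);  /* e.g. 2 MB on x86-64 */
size = ALIGN(real_size, 1UL << shift);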


Memory blocks we online/offline on x86 are at least 128MB. The KASAN "overhead" we have to allocate is 1/8 of that -- 16 MB, so essentially 8 huge pages.

__vmalloc_node_range() will increase the alignment to 2MB to try
placing huge pages first. find_vmap_lowest_match() will then search,
within the given exact 16MB area, for an 18MB area (size + align),
which won't work. So __vmalloc_node_range() will fall back to the
original PAGE_SIZE alignment and shift = PAGE_SHIFT.

__vmalloc_area_node() will then effectively set the page order to 0
via set_vm_area_page_order() -- small pages.
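
Roughly, that fallback in __vmalloc_node_range() is (again simplified,
not the exact code):

again:
    area = __get_vm_area_node(real_size, align, shift, VM_ALLOC |
                              vm_flags, start, end, node, gfp_mask,
                              caller);
    if (!area && shift > PAGE_SHIFT) {
        /* huge-page attempt failed: retry with small pages */
        shift = PAGE_SHIFT;
        align = real_align;
        size = real_size;
        goto again;
    }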

Does that make sense or am I missing something?

--
Thanks,

David / dhildenb