Re: [PATCH v1] mm/vmalloc: fix exact allocations with an alignment > 1

From: Uladzislau Rezki
Date: Tue Sep 21 2021 - 18:13:44 EST


On Fri, Sep 17, 2021 at 10:47:54AM +0200, David Hildenbrand wrote:
> > > > This patch looks KASAN-specific to me. So I would like to avoid such
> > > > changes to the vmalloc internals in order to simplify further
> > > > maintenance and keep it generic instead.
> > >
> > > There is nothing KASAN-specific in there :) It's specific to exact
> > > allocations -- and KASAN may be one of the only users for now.
> > >
> > Well, you can name it either way :) It should not be specific by
> > design, otherwise, as I mentioned, the allocator becomes a piece of
> > code that handles corner cases, which is not acceptable.
>
> Let's not overstress the situation of adding essentially 3 LOC in order to
> fix a sane use case of the vmalloc allocator that was not properly
> considered by the internal changes of 68ad4a330433 ("mm/vmalloc.c: keep
> track of free blocks for vmap allocation").
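>
> (To make it concrete -- a sketch of the idea only, not the verbatim
> diff; it assumes the caller of an exact range passes a vstart that
> already satisfies the requested alignment, as KASAN does:)
>
> <snip>
> 	/*
> 	 * An exact allocation: the range leaves no room for any
> 	 * alignment overhead, so don't let it inflate the search
> 	 * length; the alignment either already applies to vstart
> 	 * or the allocation cannot succeed at all.
> 	 */
> 	if (vend - vstart == size)
> 		align = 1;
> <snip>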
>
> >
> > > >
> > > > Currently find_vmap_lowest_match() adjusts the search size for any
> > > > alignment value, even for PAGE_SIZE alignment. That is not optimal,
> > > > because one page (PAGE_SIZE) is the minimum granularity we operate
> > > > on for vmalloc allocations, thus all free blocks are page aligned.
> > > >
> > > > That means we should adjust the search length only if the requested
> > > > alignment is bigger than one page; for a one-page, i.e. PAGE_SIZE,
> > > > alignment there is no need for such an extra adjustment, because
> > > > every free->va_start sits on a page boundary anyway.
> > > > Could you please test the below patch, which is a generic improvement?
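> > > >
> > > > (The gist of it, as a sketch -- the helper name and placement are
> > > > illustrative, not the actual diff:)
> > > >
> > > > <snip>
> > > > static __always_inline unsigned long
> > > > adjust_search_size(unsigned long size, unsigned long align)
> > > > {
> > > > 	/*
> > > > 	 * All free blocks are page aligned, so the alignment
> > > > 	 * overhead is only needed for align > PAGE_SIZE; for a
> > > > 	 * PAGE_SIZE alignment the plain size is enough.
> > > > 	 */
> > > > 	if (align <= PAGE_SIZE)
> > > > 		return size;
> > > >
> > > > 	return size + align - 1;
> > > > }
> > > > <snip>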
> > >
> > > I played with the exact approach below (almost exactly the same code in
> > > find_vmap_lowest_match()), and it might make sense as a general improvement
> > > -- if we can guarantee that start+end of ranges are always PAGE-aligned; I
> > > was not able to come to that conclusion...
> > All free blocks are PAGE aligned; that is how it has to be. vstart also
> > has to be aligned, otherwise __vunmap() will complain about freeing a
> > bad address:
> >
> > <snip>
> > 	if (WARN(!PAGE_ALIGNED(addr), "Trying to vfree() bad address (%p)\n",
> > 			addr))
> > 		return;
> > <snip>
> >
> > BTW, we should add an extra check to alloc_vmap_area(), something like
> > below:
> >
> > <snip>
> > 	if (!PAGE_ALIGNED(ALIGN(vstart, align))) {
> > 		WARN_ONCE(1, "vmalloc: vstart or align are not page aligned (0x%lx, 0x%lx)\n",
> > 			vstart, align);
> > 		return ERR_PTR(-EBUSY);
> > 	}
> > <snip>
> >
> > to check that the passed pair of vstart and align is correct.
> >
>
> Yes, we'd better.
>
> > >
> > > vmap_init_free_space() shows me that our existing alignment code/checks
> > > might be quite fragile.
> > >
> > It is not important to page align the first address. As I mentioned
> > before, vstart and align have to be correct, and we should add such a
> > check.
> >
> > >
> > > But I mainly decided to go with my patch instead because it will also
> > > work for exact allocations with align > PAGE_SIZE: most notably, when
> > > we try to populate huge pages instead of base pages in
> > > __vmalloc_node_range() by increasing the alignment. If the exact range
> > > allows for using huge pages, placing huge pages will work just fine
> > > with my modifications, while it won't with your approach.
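> > >
> > > (Roughly how the huge-page path bumps the alignment -- a paraphrase
> > > of the __vmalloc_node_range() logic, not a verbatim quote:)
> > >
> > > <snip>
> > > 	if (vmap_allow_huge && arch_vmap_pmd_supported(prot) &&
> > > 			size >= PMD_SIZE) {
> > > 		/* Align the area so it can be mapped with huge pages. */
> > > 		align = max(align, PMD_SIZE);
> > > 		size = ALIGN(size, PMD_SIZE);
> > > 	}
> > > <snip>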
> > >
> > > Long story short: my approach also handles exact allocations with a
> > > bigger alignment; your optimization makes sense as a general
> > > improvement for any vmalloc allocation.
> > >
> > > Thanks for having a look!
> > >
> > The problem is that there are no users (seemingly only KASAN) who set a
> > range that corresponds to the exact size. If you add an alignment
> > overhead on top of
>
> So there is one user and it was broken by 68ad4a330433 ("mm/vmalloc.c: keep
> track of free blocks for vmap allocation").
>
> > it, the search size should include that overhead, because we may end up
> > with an exactly-sized free block that no longer fits once the alignment
> > is applied. This is how allocators handle alignment.
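> >
> > A quick example with made-up numbers:
> >
> > <snip>
> > /* A free block that is exactly the requested size... */
> > unsigned long va_start = 0x201000UL, va_end = 0x203000UL;
> > unsigned long size = 0x2000UL, align = 0x10000UL;
> >
> > /* ...stops fitting once its start is aligned up: */
> > unsigned long nva_start = ALIGN(va_start, align);	/* 0x210000 */
> > /* nva_start + size > va_end, so this block must be skipped. */
> > <snip>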
>
> This is an implementation detail of the vmalloc allocator ever since
> 68ad4a330433 ("mm/vmalloc.c: keep track of free blocks for vmap
> allocation").
>
> If callers pass an exact range, and the alignment they specify applies,
> why should we reject such an allocation? It leaks an implementation
> detail, which is easily fixed internally, into the callers.
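>
> (Concretely -- mirroring the KASAN call quoted below: shadow_start is
> page aligned, so a PAGE_SIZE alignment applies trivially to the exact
> range:)
>
> <snip>
> 	/* Exact range: shadow_end - shadow_start == shadow_size. */
> 	ret = __vmalloc_node_range(shadow_size, PAGE_SIZE, shadow_start,
> 				   shadow_end, GFP_KERNEL, PAGE_KERNEL,
> 				   VM_NO_GUARD,
> 				   pfn_to_nid(mem_data->start_pfn),
> 				   __builtin_return_address(0));
> <snip>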
>
> >
> > Another approach is (you mentioned it in your commit message):
> >
> > urezki@pc638:~/data/raid0/coding/linux-next.git$ git diff mm/kasan/shadow.c
> > diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
> > index 082ee5b6d9a1..01d3bd76c851 100644
> > --- a/mm/kasan/shadow.c
> > +++ b/mm/kasan/shadow.c
> > @@ -200,7 +200,7 @@ static int __meminit kasan_mem_notifier(struct notifier_block *nb,
> >  	if (shadow_mapped(shadow_start))
> >  		return NOTIFY_OK;
> >
> > -	ret = __vmalloc_node_range(shadow_size, PAGE_SIZE, shadow_start,
> > +	ret = __vmalloc_node_range(shadow_size, 1, shadow_start,
> >  				   shadow_end, GFP_KERNEL,
> >  				   PAGE_KERNEL, VM_NO_GUARD,
> >  				   pfn_to_nid(mem_data->start_pfn),
> > urezki@pc638:~/data/raid0/coding/linux-next.git$
> >
> > Why not? Also, you can increase the range in the KASAN code.
>
> No, that's leaking implementation details to the caller. And no, increasing
> the range and eventually allocating something bigger (e.g., placing a huge
> page where it might not have been possible) is not acceptable for KASAN.
>
> If you're terribly unhappy with this patch,
Sorry to say, but it simply does not make sense.

>
> please suggest something reasonable to fix exact allocations that:
> a) fixes the KASAN use case;
> b) allows for automatic placement of huge pages for exact allocations;
> c) doesn't leak implementation details into the caller.
>
I am looking at it.

--
Vlad Rezki