Re: [PATCH RFC 3/3] mm/vmalloc.c: allow vread() to read out vm_map_ram areas

From: Baoquan He
Date: Wed Nov 30 2022 - 23:47:55 EST


On 11/30/22 at 02:06pm, Uladzislau Rezki wrote:
> On Thu, Nov 24, 2022 at 05:52:34PM +0800, Baoquan He wrote:
......
> > > I don't think you understand the problem.
> > >
> > > Task A: Task B: Task C:
> > > p = vm_map_ram()
> > > vread(p);
> > > ... preempted ...
> > > vm_unmap_ram(p);
> > > q = vm_map_ram();
> > > vread continues
> >
> >
> >
> > >
> > > If C has reused the address space allocated by A, task B is now reading
> > > the memory mapped by task C instead of task A. If it hasn't, it's now
> > > trying to read from unmapped, and quite possibly freed memory. Which
> > > might have been allocated by task D.
> >
> > Hmm, it may not be like that.
> >
> > Firstly, I would remind that vread() takes vmap_area_lock during the
> > whole reading process. Before this patchset, the vread() and other area
> > manipulation will have below status:
> > 1) __get_vm_area_node() could only finish insert_vmap_area(), then free
> > the vmap_area_lock, then warting;
> > 2) __get_vm_area_node() finishs setup_vmalloc_vm()
> > 2.1) doing mapping but not finished;
> > 2.2) clear_vm_uninitialized_flag() is called after mapping is done;
> > 3) remove_vm_area() is called to set -> = NULL, then free vmap_area_lock;
> >
> > Task A: Task B: Task C:
> > p = __get_vm_area_node()
> > remove_vm_area(p);
> > vread(p);
> >
> > vread end
> > q = __get_vm_area_node();
> >
> > So, as you can see, the checking "if (!va->vm)" in vread() will filter
> > out vmap_area:
> > a) areas if only insert_vmap_area() is called, but ->vm is still NULL;
> > b) areas if remove_vm_area() is called to clear ->vm to NULL;
> > c) areas created through vm_map_ram() since its ->vm is always NULL;
> >
> > Means vread() will read out vmap_area:
> > d) areas if setup_vmalloc_vm() is called;
> > 1) mapping is done on areas, e.g clear_vm_uninitialized_flag() is
> > called;
> > 2) mapping is being handled, just after returning from setup_vmalloc_vm();
> >
> >
> > ******* after this patchset applied:
> >
> > Task A: Task B: Task C:
> > p = vm_map_ram()
> > vm_unmap_ram(p);
> > vread(p);
> > vb_vread()
> > vread end
> >
> > q = vm_map_ram();
> >
> > With this patchset applied, other than normal areas, for the
> > vm_map_ram() areas:
> > 1) In vm_map_ram(), set vmap_area->flags = VMAP_RAM when vmap_area_lock
> > is taken; In vm_unmap_ram(), clear it wiht "va->flags &= ~VMAP_RAM"
> > when vmap_area_lock is taken;
> > 2) If vmap_block, set va->flags = VMAP_RAM|VMAP_BLOCK; And set
> > vmap_block->used_map to track the used region, filter out the dirty
> > and free region;
> > 3) In vb_vread(), we take vb->lock to avoid reading out dirty regions.
> >
> > Please help point out what is wrong or I missed.
> >
> One thing is we still can read-out un-mapped pages, i.e. a text instead:
>
> <snip>
> static void vb_free(unsigned long addr, unsigned long size)
> {
> unsigned long offset;
> unsigned int order;
> struct vmap_block *vb;
>
> BUG_ON(offset_in_page(size));
> BUG_ON(size > PAGE_SIZE*VMAP_MAX_ALLOC);
>
> flush_cache_vunmap(addr, addr + size);
>
> order = get_order(size);
> offset = (addr & (VMAP_BLOCK_SIZE - 1)) >> PAGE_SHIFT;
> vb = xa_load(&vmap_blocks, addr_to_vb_idx(addr));
>
> vunmap_range_noflush(addr, addr + size);
>
> if (debug_pagealloc_enabled_static())
> flush_tlb_kernel_range(addr, addr + size);
>
> spin_lock(&vb->lock);
> ...
> <snip>
>
> or am i missing something? Is it a problem? It might be. Another thing
> it would be good if you upload a new patchset so it is easier to review
> it.

Thanks for checking.

Please check patch 1, vmap_block->used_map is introduced to track the
vb regions allocation and free via vb_alloc/vb_free(). The vb->used_map
only set for pages being used, the dirty and free regions are all
cleared. In the added vb_vread() of patch 3, vb->lock is required and
taken during the whole vb vmap reading, and only page of regions set in
vb->used_map will be read out.

So if vb_free() is called, and vb->used_map is cleared away, it won't
be read out in vb_vread(). If vb_free() is requiring vb->lock and waiting,
the region hasn't been unmapped and can be read out.

@@ -2114,6 +2118,9 @@ static void vb_free(unsigned long addr, unsigned long size)
order = get_order(size);
offset = (addr & (VMAP_BLOCK_SIZE - 1)) >> PAGE_SHIFT;
vb = xa_load(&vmap_blocks, addr_to_vb_idx(addr));
+ spin_lock(&vb->lock);
+ bitmap_clear(vb->used_map, offset, (1UL << order));
+ spin_unlock(&vb->lock);

vunmap_range_noflush(addr, addr + size);

I will work out a formal patchset for reviewing, will post and CC all
reviewers.

Thanks
Baoquan