Re: [PATCH v9 16/17] mm: make vma cache SLAB_TYPESAFE_BY_RCU

From: Suren Baghdasaryan
Date: Wed Jan 15 2025 - 00:42:22 EST


On Tue, Jan 14, 2025 at 7:58 PM Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> wrote:
>
> * Suren Baghdasaryan <surenb@xxxxxxxxxx> [250114 22:15]:
> > On Tue, Jan 14, 2025 at 6:27 PM Wei Yang <richard.weiyang@xxxxxxxxx> wrote:
> > >
> > > On Fri, Jan 10, 2025 at 08:26:03PM -0800, Suren Baghdasaryan wrote:
> > >
> > > >diff --git a/kernel/fork.c b/kernel/fork.c
> > > >index 9d9275783cf8..151b40627c14 100644
> > > >--- a/kernel/fork.c
> > > >+++ b/kernel/fork.c
> > > >@@ -449,6 +449,42 @@ struct vm_area_struct *vm_area_alloc(struct mm_struct *mm)
> > > > return vma;
> > > > }
> > > >
> > > >+static void vm_area_init_from(const struct vm_area_struct *src,
> > > >+ struct vm_area_struct *dest)
> > > >+{
> > > >+ dest->vm_mm = src->vm_mm;
> > > >+ dest->vm_ops = src->vm_ops;
> > > >+ dest->vm_start = src->vm_start;
> > > >+ dest->vm_end = src->vm_end;
> > > >+ dest->anon_vma = src->anon_vma;
> > > >+ dest->vm_pgoff = src->vm_pgoff;
> > > >+ dest->vm_file = src->vm_file;
> > > >+ dest->vm_private_data = src->vm_private_data;
> > > >+ vm_flags_init(dest, src->vm_flags);
> > > >+ memcpy(&dest->vm_page_prot, &src->vm_page_prot,
> > > >+ sizeof(dest->vm_page_prot));
> > > >+ /*
> > > >+ * src->shared.rb may be modified concurrently when called from
> > > >+ * dup_mmap(), but the clone will reinitialize it.
> > > >+ */
> > > >+ data_race(memcpy(&dest->shared, &src->shared, sizeof(dest->shared)));
> > > >+ memcpy(&dest->vm_userfaultfd_ctx, &src->vm_userfaultfd_ctx,
> > > >+ sizeof(dest->vm_userfaultfd_ctx));
> > > >+#ifdef CONFIG_ANON_VMA_NAME
> > > >+ dest->anon_name = src->anon_name;
> > > >+#endif
> > > >+#ifdef CONFIG_SWAP
> > > >+ memcpy(&dest->swap_readahead_info, &src->swap_readahead_info,
> > > >+ sizeof(dest->swap_readahead_info));
> > > >+#endif
> > > >+#ifndef CONFIG_MMU
> > > >+ dest->vm_region = src->vm_region;
> > > >+#endif
> > > >+#ifdef CONFIG_NUMA
> > > >+ dest->vm_policy = src->vm_policy;
> > > >+#endif
> > > >+}
> > >
> > > Would this be difficult to maintain? We should make sure not miss or overwrite
> > > anything.
> >
> > Yeah, it is less maintainable than a simple memcpy() but I did not
> > find a better alternative. I added a warning above the struct
> > vm_area_struct definition to update this function every time we change
> > that structure. Not sure if there is anything else I can do to help
> > with this.
>
> Here's a horrible idea.. if we put the ref count at the end or start of
> the struct, we could set the ref count to zero and copy the other area
> in one mmecpy().
>
> Even worse idea, we could use a pointer_of like macro to get the position
> of the ref count in the vma struct, set the ref count to zero and
> carefully copy the other two parts in two memcpy() operations.

I implemented this approach in v3 of this patchset here:
https://lore.kernel.org/all/20241117080931.600731-5-surenb@xxxxxxxxxx/
like this:

#define VMA_BEFORE_LOCK offsetof(struct vm_area_struct, vm_lock)
#define VMA_LOCK_END(vma) \
(((void *)(vma)) + offsetofend(struct vm_area_struct, vm_lock))
#define VMA_AFTER_LOCK \
(sizeof(struct vm_area_struct) - offsetofend(struct
vm_area_struct, vm_lock))

static inline void vma_copy(struct vm_area_struct *new, struct
vm_area_struct *orig)
{
/* Preserve vma->vm_lock */
data_race(memcpy(new, orig, VMA_BEFORE_LOCK));
data_race(memcpy(VMA_LOCK_END(new), VMA_LOCK_END(orig),
VMA_AFTER_LOCK));
}

If this looks more maintainable I can revive it.
Maybe introduce a more generic function to copy any structure
excluding a specific field and use it like this:

copy_struct_except(new, orig, struct vm_area_struct, vm_refcnt);

Would that be better?

>
> Feel free to disregard these ideas as it is late here and I'm having
> fun thinking up bad ways to make this "more" maintainable.
>
> Either of these would make updating the struct easier, but very painful
> to debug when it goes wrong (or reading the function).
>
> Thanks,
> Liam