Re: [PATCH] mm/vmalloc: Fix unlock order in s_stop()

From: Uladzislau Rezki
Date: Mon Dec 14 2020 - 10:12:18 EST


On Sun, Dec 13, 2020 at 09:51:34PM +0000, Matthew Wilcox wrote:
> On Sun, Dec 13, 2020 at 07:39:36PM +0100, Uladzislau Rezki wrote:
> > On Sun, Dec 13, 2020 at 01:08:43PM -0500, Waiman Long wrote:
> > > When multiple locks are acquired, they should be released in reverse
> > > order. For s_start() and s_stop() in mm/vmalloc.c, that is not the
> > > case.
> > >
> > > s_start: mutex_lock(&vmap_purge_lock); spin_lock(&vmap_area_lock);
> > > s_stop : mutex_unlock(&vmap_purge_lock); spin_unlock(&vmap_area_lock);
> > >
> > > This unlock sequence, though allowed, is not optimal. If a waiter is
> > > present, mutex_unlock() will need to go through the slowpath of waking
> > > up the waiter with preemption disabled. Fix that by releasing the
> > > spinlock first before the mutex.
> > >
> > > Fixes: e36176be1c39 ("mm/vmalloc: rework vmap_area_lock")
> > > Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
> > > ---
> > > mm/vmalloc.c | 4 ++--
> > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > index 6ae491a8b210..75913f685c71 100644
> > > --- a/mm/vmalloc.c
> > > +++ b/mm/vmalloc.c
> > > @@ -3448,11 +3448,11 @@ static void *s_next(struct seq_file *m, void *p, loff_t *pos)
> > >  }
> > >
> > >  static void s_stop(struct seq_file *m, void *p)
> > > -	__releases(&vmap_purge_lock)
> > >  	__releases(&vmap_area_lock)
> > > +	__releases(&vmap_purge_lock)
> > >  {
> > > -	mutex_unlock(&vmap_purge_lock);
> > >  	spin_unlock(&vmap_area_lock);
> > > +	mutex_unlock(&vmap_purge_lock);
> > >  }
> > >
> > >  static void show_numa_info(struct seq_file *m, struct vm_struct *v)
> > BTW, if walking both lists is an issue, for example when there are
> > multiple heavy readers of /proc/vmallocinfo, I think it makes sense
> > to implement RCU-safe list iteration and get rid of both locks.
>
> If we need to iterate the list efficiently, I'd suggest getting rid of
> the list and using an xarray instead. Maybe a maple tree, once that code
> is better exercised.
>
Efficiency is not really the point here. We just need a full scan of the
list that reports the mapped and un-purged areas to user space applications.
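
To make the "full scan" concrete, here is a minimal userspace sketch of
such a scan done under both locks, released in the order the patch above
proposes. It is not the kernel code: struct area, purge_lock, area_lock
and scan_total_size() are hypothetical stand-ins (a pthread mutex for
vmap_purge_lock, a C11 atomic_flag spinlock for vmap_area_lock).

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vmap_area: one node per mapped area. */
struct area {
	unsigned long start;
	unsigned long size;
	struct area *next;
};

/* Userspace stand-ins: a sleeping lock (outer) and a spinning lock (inner). */
static pthread_mutex_t purge_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_flag area_lock = ATOMIC_FLAG_INIT;

static void area_spin_lock(void)
{
	/* Spin until the flag was previously clear, i.e. we took the lock. */
	while (atomic_flag_test_and_set_explicit(&area_lock, memory_order_acquire))
		;
}

static void area_spin_unlock(void)
{
	atomic_flag_clear_explicit(&area_lock, memory_order_release);
}

/* Full scan of the list: sum the sizes of all areas while holding both locks. */
static unsigned long scan_total_size(const struct area *head)
{
	unsigned long total = 0;

	pthread_mutex_lock(&purge_lock);	/* outer lock taken first */
	area_spin_lock();			/* inner lock taken second */

	for (const struct area *a = head; a != NULL; a = a->next)
		total += a->size;

	area_spin_unlock();			/* inner lock released first */
	pthread_mutex_unlock(&purge_lock);	/* outer lock released last */

	return total;
}
```

The spinlock is dropped before the mutex, so any wakeup work done by the
mutex unlock happens outside the spinning section.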

For example, an RCU-safe list is what we need, IMHO. On the other hand, I
am not sure whether xarray is RCU-safe when elements are removed or added
(xa_erase()/xa_insert()) concurrently with a scan such as xa_for_each_XXX().
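
To illustrate what I mean by an RCU-safe scan, below is a rough userspace
analogy in C11 atomics, not kernel code. In the kernel the counterparts
would be list_add_rcu() on the update side and list_for_each_entry_rcu()
under rcu_read_lock() on the read side. struct node, list_head,
publish_front() and scan_total_size_lockless() are hypothetical names.
The sketch assumes writers are serialised by some other means, and it
does not model removal at all: freeing a node safely needs a grace period
(synchronize_rcu()/kfree_rcu() in the kernel), which is left out here.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical list node; "next" is atomic so readers can follow it locklessly. */
struct node {
	unsigned long size;
	struct node *_Atomic next;
};

static struct node *_Atomic list_head;

/*
 * Update side: initialise the node completely, then publish it with a
 * release store so a reader that sees the pointer also sees the contents.
 * Assumes a single writer at a time (serialised externally).
 */
static void publish_front(struct node *n, unsigned long size)
{
	n->size = size;
	atomic_store_explicit(&n->next,
			      atomic_load_explicit(&list_head, memory_order_relaxed),
			      memory_order_relaxed);
	atomic_store_explicit(&list_head, n, memory_order_release);
}

/* Read side: a full scan without taking any lock. */
static unsigned long scan_total_size_lockless(void)
{
	unsigned long total = 0;
	struct node *n = atomic_load_explicit(&list_head, memory_order_acquire);

	while (n != NULL) {
		total += n->size;
		n = atomic_load_explicit(&n->next, memory_order_acquire);
	}
	return total;
}
```

A reader running concurrently with publish_front() sees either the old
list or the list with the new node fully initialised, never a half-built
entry. That is all a reporting-only scan like /proc/vmallocinfo needs.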

--
Vlad Rezki