Re: [PATCH] Revert "mm/vunmap: add cond_resched() in vunmap_pmd_range"

From: Minchan Kim
Date: Mon Nov 16 2020 - 12:53:29 EST


On Fri, Nov 13, 2020 at 08:25:29AM -0800, Minchan Kim wrote:
> On Thu, Nov 12, 2020 at 02:49:19PM -0800, Andrew Morton wrote:
> > On Thu, 12 Nov 2020 12:01:01 -0800 Minchan Kim <minchan@xxxxxxxxxx> wrote:
> >
> > >
> > > On Sat, Nov 07, 2020 at 12:39:39AM -0800, Minchan Kim wrote:
> > > > Hi Andrew,
> > > >
> > > > On Fri, Nov 06, 2020 at 05:59:33PM -0800, Andrew Morton wrote:
> > > > > On Thu, 5 Nov 2020 09:02:49 -0800 Minchan Kim <minchan@xxxxxxxxxx> wrote:
> > > > >
> > > > > > This reverts commit e47110e90584a22e9980510b00d0dfad3a83354e.
> > > > > >
> > > > > > While I was doing zram testing, I found sometimes decompression failed
> > > > > > since the compression buffer was corrupted. With investigation,
> > > > > > I found below commit calls cond_resched unconditionally so it could
> > > > > > make a problem in atomic context if the task is reschedule.
> > > > > >
> > > > > > Revert the original commit for now.
> > >
> > > How should we proceed this problem?
> > >
> >
> > (top-posting repaired - please don't).
> >
> > Well, we don't want to reintroduce the softlockup reports which
> > e47110e90584a2 fixed, and we certainly don't want to do that on behalf
> > of code which is using the unmap_kernel_range() interface incorrectly.
> >
> > So I suggest either
> >
> > a) make zs_unmap_object() stop doing the unmapping from atomic context or
>
> It's not easy since the path already hold several spin_locks as well as
> per-cpu context. I could pursuit the direction but it takes several
> steps to change entire locking scheme in the zsmalloc, which will
> take time(we couldn't leave zsmalloc broken until then) and hard to
> land on stable tree.
>
> >
> > b) figure out whether the vmalloc unmap code is *truly* unsafe from
> > atomic context - perhaps it is only unsafe from interrupt context,
> > in which case we can rework the vmalloc.c checks to reflect this, or
>
> I don't get the point. I assume your suggestion would be "let's make the
> vunmap code atomic context safe" but how could it solve this problem?
>
> The point from e47110e90584a2 was softlockup could be triggered if
> vunamp deal with large mapping so need *explict reschedule* point
> for CONFIG_PREEMPT_VOLUNTARY. However, CONFIG_PREEMPT_VOLUNTARY
> doesn't consider peempt count so even though we could make vunamp
> atomic safe to make a call under spin_lock:
>
> spin_lock(&A);
> vunmap
> vunmap_pmd_range
> cond_resched <- bang
>
> Below options would have same problem, too.
> Let me know if I misunderstand something.
>
> >
> > c) make the vfree code callable from all contexts. Which is by far
> > the preferred solution, but may be tough.
> >
> >
> > Or maybe not so tough - if the only problem in the vmalloc code is the
> > use of mutex_trylock() from irqs then it may be as simple as switching
> > to old-style semaphores and using down_trylock(), which I think is
> > irq-safe.
> >
> > However old-style semaphores are deprecated. A hackyish approach might
> > be to use an rwsem always in down_write mode and use
> > down_write_trylock(), which I think is also callable from interrrupt
> > context.
> >
> > But I have a feeling that there are other reasons why vfree shouldn't
> > be called from atomic context, apart from the mutex_trylock-in-irq
> > issue.

How about this?