Re: [PATCH] mm: avoid slub allocation while holding list_lock

From: Kirill A. Shutemov
Date: Tue Sep 10 2019 - 05:16:29 EST


On Mon, Sep 09, 2019 at 03:39:38PM -0600, Yu Zhao wrote:
> On Tue, Sep 10, 2019 at 05:57:22AM +0900, Tetsuo Handa wrote:
> > On 2019/09/10 1:00, Kirill A. Shutemov wrote:
> > > On Mon, Sep 09, 2019 at 12:10:16AM -0600, Yu Zhao wrote:
> > >> If we are already under list_lock, don't call kmalloc(). Otherwise we
> > >> will run into deadlock because kmalloc() also tries to grab the same
> > >> lock.
> > >>
> > >> Instead, allocate pages directly. Given that page->objects currently
> > >> has 15 bits, we only need 1 page. We may waste some memory but we only
> > >> do so when slub debug is on.
> > >>
> > >> WARNING: possible recursive locking detected
> > >> --------------------------------------------
> > >> mount-encrypted/4921 is trying to acquire lock:
> > >> (&(&n->list_lock)->rlock){-.-.}, at: ___slab_alloc+0x104/0x437
> > >>
> > >> but task is already holding lock:
> > >> (&(&n->list_lock)->rlock){-.-.}, at: __kmem_cache_shutdown+0x81/0x3cb
> > >>
> > >> other info that might help us debug this:
> > >> Possible unsafe locking scenario:
> > >>
> > >> CPU0
> > >> ----
> > >> lock(&(&n->list_lock)->rlock);
> > >> lock(&(&n->list_lock)->rlock);
> > >>
> > >> *** DEADLOCK ***
> > >>
> > >> Signed-off-by: Yu Zhao <yuzhao@xxxxxxxxxx>
> > >
> > > Looks sane to me:
> > >
> > > Acked-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> > >
> >
> > Really?
> >
> > Since page->objects is handled as a bitmap, alignment should be BITS_PER_LONG
> > rather than BITS_PER_BYTE (though in this particular case, get_order() would
> > implicitly align to BITS_PER_BYTE * PAGE_SIZE). But get_order(0) is
> > undefined behavior.
>
> I think we can safely assume PAGE_SIZE is unsigned long aligned and
> page->objects is non-zero.

I think it's better to handle page->objects == 0 gracefully. It should not
happen, but this code handles situations that should not happen.
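
Something along these lines is what I have in mind. This is only a rough
user-space sketch under stated assumptions, not the actual patch:
PAGE_SIZE_BYTES, bitmap_bytes(), order_for_size() and object_map_order()
are hypothetical names, and order_for_size() is a stand-in for the kernel's
get_order() that defines the zero case instead of leaving it undefined. The
bitmap size is rounded up to whole unsigned longs, and an object count of
zero is rejected before any order is computed.

```c
#include <limits.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. The kernel uses PAGE_SIZE instead. */
#define PAGE_SIZE_BYTES 4096UL
/* Width of unsigned long in bits (the kernel's BITS_PER_LONG). */
#define LONG_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Bytes needed for a bitmap of nbits, rounded up to whole unsigned longs. */
static size_t bitmap_bytes(unsigned int nbits)
{
	size_t longs = (nbits + LONG_BITS - 1) / LONG_BITS;

	return longs * sizeof(unsigned long);
}

/*
 * Smallest order such that (PAGE_SIZE_BYTES << order) >= size.
 * Unlike get_order(), size == 0 is well defined here and yields 0.
 */
static unsigned int order_for_size(size_t size)
{
	unsigned int order = 0;
	size_t span = PAGE_SIZE_BYTES;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/*
 * Page order for the object bitmap of a slab page, or -1 when there
 * are no objects to track, so the caller can skip the allocation.
 */
static int object_map_order(unsigned int objects)
{
	if (objects == 0)
		return -1;
	return (int)order_for_size(bitmap_bytes(objects));
}
```

With a 15-bit page->objects the largest count is 32767, which needs
4096 bytes of bitmap, so object_map_order() returns 0 (a single page)
under the 4 KiB assumption.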

--
Kirill A. Shutemov