Re: [PATCH] mm: create a separate slab for page->ptl allocation
From: Kirill A. Shutemov
Date: Tue Nov 05 2013 - 18:39:04 EST
[ sorry, resend to all ]
On Tue, Nov 05, 2013 at 03:01:45PM -0800, Andrew Morton wrote:
> On Tue, 22 Oct 2013 14:53:59 +0300 "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx> wrote:
>
> > If DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC are enabled spinlock_t on x86_64
> > is 72 bytes. For page->ptl they will be allocated from kmalloc-96 slab,
> > so we loose 24 on each. An average system can easily allocate few tens
> > thousands of page->ptl and overhead is significant.
> >
> > Let's create a separate slab for page->ptl allocation to solve this.
> >
> > ...
> >
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -4332,11 +4332,19 @@ void copy_user_huge_page(struct page *dst, struct page *src,
> > #endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_HUGETLBFS */
> >
> > #if USE_SPLIT_PTE_PTLOCKS
> > +struct kmem_cache *page_ptl_cachep;
> > +void __init ptlock_cache_init(void)
> > +{
> > + if (sizeof(spinlock_t) > sizeof(long))
> > + page_ptl_cachep = kmem_cache_create("page->ptl",
> > + sizeof(spinlock_t), 0, SLAB_PANIC, NULL);
> > +}
>
> Confused. If (sizeof(spinlock_t) > sizeof(long)) happens to be false
> then the kernel will later crash. It would be better to use BUILD_BUG_ON()
> here, if that works. Otherwise BUG_ON.
if (sizeof(spinlock_t) > sizeof(long)) is false, we don't need dynamicly
allocate page->ptl. It's embedded to struct page itself. __ptlock_alloc()
never called in this case.
> Also, we have the somewhat silly KMEM_CACHE() macro, but it looks
> inapplicable here?
The first argument of KMEM_CACHE() is struct name, but we have typedef
here.
> > bool __ptlock_alloc(struct page *page)
> > {
> > spinlock_t *ptl;
> >
> > - ptl = kmalloc(sizeof(spinlock_t), GFP_KERNEL);
> > + ptl = kmem_cache_alloc(page_ptl_cachep, GFP_KERNEL);
> > if (!ptl)
> > return false;
> > page->ptl = (unsigned long)ptl;
> > @@ -4346,6 +4354,6 @@ bool __ptlock_alloc(struct page *page)
> > void __ptlock_free(struct page *page)
> > {
> > if (sizeof(spinlock_t) > sizeof(page->ptl))
> > - kfree((spinlock_t *)page->ptl);
> > + kmem_cache_free(page_ptl_cachep, (spinlock_t *)page->ptl);
>
> A void* cast would suffice here, but I suppose the spinlock_t* cast has
> some documentation value.
Right.
--
Kirill A. Shutemov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/