RE: [PATCH 01/11] mm: page_alloc: set_migratetype_isolate: drain PCPprior to isolating

From: Marek Szyprowski
Date: Thu Jan 05 2012 - 14:20:39 EST


Hello,

On Thursday, January 05, 2012 4:40 PM MichaÅ Nazarewicz wrote:

> On Thu, 29 Dec 2011 13:39:02 +0100, Marek Szyprowski <m.szyprowski@xxxxxxxxxxx> wrote:
> > From: Michal Nazarewicz <mina86@xxxxxxxxxx>
> >
> > When set_migratetype_isolate() sets pageblock's migrate type, it does
> > not change each page_private data. This makes sense, as the function
> > has no way of knowing what kind of information page_private stores.
> >
> > Unfortunately, if a page is on PCP list, it's page_private indicates
> > its migrate type. This means, that if a page on PCP list gets
> > isolated, a call to free_pcppages_bulk() will assume it has the old
> > migrate type rather than MIGRATE_ISOLATE. This means, that a page
> > which should be isolated, will end up on a free list of it's old
> > migrate type.
> >
> > Coincidentally, at the very end, set_migratetype_isolate() calls
> > drain_all_pages() which leads to calling free_pcppages_bulk(), which
> > does the wrong thing.
> >
> > To avoid this situation, this commit moves the draining prior to
> > setting pageblock's migratetype and moving pages from old free list to
> > MIGRATETYPE_ISOLATE's free list.
> >
> > Because of spin locks this is a non-trivial change however as both
> > set_migratetype_isolate() and free_pcppages_bulk() grab zone->lock.
> > To solve this problem, this commit renames free_pcppages_bulk() to
> > __free_pcppages_bulk() and changes it so that it no longer grabs
> > zone->lock instead requiring caller to hold it. This commit later
> > adds a __zone_drain_all_pages() function which works just like
> > drain_all_pages() expects that it drains only pages from a single zone
> > and assumes that caller holds zone->lock.
>
> As it turns out, with some more testing on SMP systems, this whole patch
> turned out to be incorrect.
>
> We have been thinking about other approach and, if we were to use something
> else then the first patch from CMAv17[1], the best thing we could came up
> with was to unconditionally call drain_all_pages() at the beginning of
> set_migratetype_isolate() before the call to spin_lock_irqsave(). It has
> a possible race condition but a nightly stress test did have not shown any
> problems.
>
> Nonetheless, the cleanest, in my opinion, solution is to use the first patch
> from CMAv17 which can be found at [1].
>
> So, to sum up: if you intend to test CMAv18, instead of applying this first
> patch either use first patch from CMAv17[1] or put an unconditional call to
> drain_all_pages() at the beginning of set_migrate_isolate() function.
>
> Sorry for the troubles.
>
> [1] http://www.spinics.net/lists/arm-kernel/msg148494.html

I've updated our public git repository to include this workaround. You can
pull the patches from the following addresses:

git://git.infradead.org/users/kmpark/linux-samsung 3.2-rc7-cma-v18

http://git.infradead.org/users/kmpark/linux-samsung/shortlog/refs/heads/3.2-rc7-cma-v18

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/