Re: [PATCH] Revert "mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE"

From: Joonsoo Kim
Date: Thu May 17 2018 - 23:04:43 EST


On Thu, May 17, 2018 at 10:53:32AM -0700, Laura Abbott wrote:
> On 05/17/2018 10:08 AM, Michal Hocko wrote:
> >On Thu 17-05-18 18:49:47, Michal Hocko wrote:
> >>On Thu 17-05-18 16:58:32, Ville Syrjälä wrote:
> >>>On Thu, May 17, 2018 at 04:36:29PM +0300, Ville Syrjälä wrote:
> >>>>On Thu, May 17, 2018 at 03:21:09PM +0200, Michal Hocko wrote:
> >>>>>On Thu 17-05-18 15:59:59, Ville Syrjala wrote:
> >>>>>>From: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
> >>>>>>
> >>>>>>This reverts commit bad8c6c0b1144694ecb0bc5629ede9b8b578b86e.
> >>>>>>
> >>>>>>Make x86 with HIGHMEM=y and CMA=y boot again.
> >>>>>
> >>>>>Is there any bug report with some more details? It is much more
> >>>>>preferable to fix the issue rather than to revert the whole thing
> >>>>>right away.
> >>>>
> >>>>The machine I have in front of me right now didn't give me anything.
> >>>>Black screen, and netconsole was silent. No serial port on this
> >>>>machine unfortunately.
> >>>
> >>>Booted on another machine with serial:
> >>
> >>Could you provide your .config please?
> >>
> >>[...]
> >>>[ 0.000000] cma: Reserved 4 MiB at 0x0000000037000000
> >>[...]
> >>>[ 0.000000] BUG: Bad page state in process swapper pfn:377fe
> >>>[ 0.000000] page:f53effc0 count:0 mapcount:-127 mapping:00000000 index:0x0
> >>
> >>OK, so this looks the be the source of the problem. -128 would be a
> >>buddy page but I do not see anything that would set the counter to -127
> >>and the real map count updates shouldn't really happen that early.
> >>
> >>Maybe CONFIG_DEBUG_VM and CONFIG_DEBUG_HIGHMEM will tell us more.
> >
> >Looking closer, I _think_ that the bug is in set_highmem_pages_init->is_highmem
> >and zone_movable_is_highmem might force CMA pages in the zone movable to
> >be initialized as highmem. And that sounds supicious to me. Joonsoo?
> >
>
> For a point of reference, arm with this configuration doesn't hit this bug
> because highmem pages are freed via the memblock interface only instead
> of iterating through each zone. It looks like the x86 highmem code
> assumes only a single highmem zone and/or it's disjoint?

Good point! Reason of the crash is that the span of MOVABLE_ZONE is
extended to whole node span for future CMA initialization, and,
normal memory is wrongly freed here.

Here goes the fix. Ville, Could you test below patch?
I re-generated the issue on my side and this patch fixed it.

Thanks.

------------>8-------------